Skip to main content
Erschienen in: Journal of Scientific Computing 1/2013

01.10.2013

Accelerating the Arnoldi-Type Algorithm for the PageRank Problem and the ProteinRank Problem

verfasst von: Gang Wu, Ying Zhang, Yimin Wei

Erschienen in: Journal of Scientific Computing | Ausgabe 1/2013

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

PageRank is an algorithm for computing a ranking for every Web page based on the graph of the Web. It plays an important role in Google’s search engine. The core of the PageRank algorithm involves computing the principal eigenvector of the Google matrix. Currently, we need to solve PageRank problems with high damping factors, which cost considerable time. A possible approach for accelerating the computation is the Arnoldi-type algorithm. However, this algorithm may not be satisfactory when the damping factor is high and the dimension of the Krylov subspace is low. Even worse, it may stagnate in practice. In this paper, we propose two strategies to improve the efficiency of the Arnoldi-type algorithm. Theoretical analysis shows that the new algorithms can accelerate the original Arnoldi-type algorithm considerably, and circumvent the drawback of stagnation. Numerical experiments illustrate that the accelerated Arnoldi-type algorithms usually outperform many state-of-the-art accelerating algorithms for PageRank. Applications of the new algorithms to function predicting of proteins are also discussed.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literatur
1.
2.
Zurück zum Zitat Avrachenkov, K., Litvak, N., Nemirovsky, D., Osipova, N.: Monte carlo methods in PageRank computation: when one iteration is sufficient. SIAM J. Numer. Anal. 45, 890–904 (2007)MathSciNetMATHCrossRef Avrachenkov, K., Litvak, N., Nemirovsky, D., Osipova, N.: Monte carlo methods in PageRank computation: when one iteration is sufficient. SIAM J. Numer. Anal. 45, 890–904 (2007)MathSciNetMATHCrossRef
3.
Zurück zum Zitat Beattie, C., Embree, M., Sorensen, D.: Convergence of polynomial restart Krylov methods for eigenvalue computation. SIAM Rev. 47, 492–515 (2005)MathSciNetMATHCrossRef Beattie, C., Embree, M., Sorensen, D.: Convergence of polynomial restart Krylov methods for eigenvalue computation. SIAM Rev. 47, 492–515 (2005)MathSciNetMATHCrossRef
4.
Zurück zum Zitat Bellalij, M., Saad, Y., Sadok, H.: Further analysis of the Arnoldi process for eigenvalue problems. SIAM J. Numer. Anal. 48, 393–407 (2010)MathSciNetMATHCrossRef Bellalij, M., Saad, Y., Sadok, H.: Further analysis of the Arnoldi process for eigenvalue problems. SIAM J. Numer. Anal. 48, 393–407 (2010)MathSciNetMATHCrossRef
6.
Zurück zum Zitat Berman, A., Plemmons, R.: Nonnegative Matrices in the Mathematical Sciences, 2nd edn. SIAM, Philadelphia (1994)MATHCrossRef Berman, A., Plemmons, R.: Nonnegative Matrices in the Mathematical Sciences, 2nd edn. SIAM, Philadelphia (1994)MATHCrossRef
7.
Zurück zum Zitat Boldi, P., Santini, M., Vigna, S.: PageRank: functional dependencies. ACM Trans. Inf. Syst. 27(1) (2009) Boldi, P., Santini, M., Vigna, S.: PageRank: functional dependencies. ACM Trans. Inf. Syst. 27(1) (2009)
8.
9.
Zurück zum Zitat Chen, Z., Cai, Z., Li, M., Liu, B.: Using search engine technology for protein function prediction. Inter. J. Bio. Res. Appl. 7, 101–113 (2011)CrossRef Chen, Z., Cai, Z., Li, M., Liu, B.: Using search engine technology for protein function prediction. Inter. J. Bio. Res. Appl. 7, 101–113 (2011)CrossRef
10.
Zurück zum Zitat Cicone, A., Serra-Capizzano, S.: Google PageRanking problem: the model and the analysis. J. Comput. Appl. Math. 234, 3140–3169 (2010)MathSciNetMATHCrossRef Cicone, A., Serra-Capizzano, S.: Google PageRanking problem: the model and the analysis. J. Comput. Appl. Math. 234, 3140–3169 (2010)MathSciNetMATHCrossRef
11.
Zurück zum Zitat Cipra, B.: The best of the 20th century: editors name top 10 algorithms. SIAM News 33(4) (2000) Cipra, B.: The best of the 20th century: editors name top 10 algorithms. SIAM News 33(4) (2000)
13.
Zurück zum Zitat Del Corso, G., Gullì, A., Romani, F.: Comparison of Krylov subspace methods on the PageRank problem. J. Comput. Appl. Math. 210, 159–166 (2007)MathSciNetMATHCrossRef Del Corso, G., Gullì, A., Romani, F.: Comparison of Krylov subspace methods on the PageRank problem. J. Comput. Appl. Math. 210, 159–166 (2007)MathSciNetMATHCrossRef
14.
Zurück zum Zitat Freschi, V.: Protein function prediction from interaction networks using a random walk ranking algorithm. In: Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, 14–17, pp. 42–48 (2007) Freschi, V.: Protein function prediction from interaction networks using a random walk ranking algorithm. In: Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, 14–17, pp. 42–48 (2007)
15.
Zurück zum Zitat Gleich, D., Zhukov, L., Berkhin, P.: Fast Parallel PageRank: A Linear System Approach. Technical Report, Yahoo! (2004) Gleich, D., Zhukov, L., Berkhin, P.: Fast Parallel PageRank: A Linear System Approach. Technical Report, Yahoo! (2004)
16.
Zurück zum Zitat Gleich, D., Gray, A., Greif, C., Lau, T.: An inner-outer iteration for computing PageRank. SIAM J. Sci. Comput. 32, 349–371 (2010)MathSciNetMATHCrossRef Gleich, D., Gray, A., Greif, C., Lau, T.: An inner-outer iteration for computing PageRank. SIAM J. Sci. Comput. 32, 349–371 (2010)MathSciNetMATHCrossRef
18.
Zurück zum Zitat Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore (1996)MATH Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. The Johns Hopkins University Press, Baltimore (1996)MATH
19.
Zurück zum Zitat Grindrod, P.: Range-dependent random graphs and their application to modelling large small-world proteome datasets. Phys. Rev E. 66, 066702 (2002)CrossRef Grindrod, P.: Range-dependent random graphs and their application to modelling large small-world proteome datasets. Phys. Rev E. 66, 066702 (2002)CrossRef
20.
Zurück zum Zitat Haveliwala, T., Kamvar, S.: The second eigenvalue of the Google matrix. Stanford University Technical Report (2003) Haveliwala, T., Kamvar, S.: The second eigenvalue of the Google matrix. Stanford University Technical Report (2003)
21.
Zurück zum Zitat Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)MATHCrossRef Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)MATHCrossRef
22.
Zurück zum Zitat Ipsen, I., Selee, T.: PageRank computation, with special attention to dangling nodes. SIAM J. Matrix Anal. Appl. 29, 1281–1296 (2007)MathSciNetCrossRef Ipsen, I., Selee, T.: PageRank computation, with special attention to dangling nodes. SIAM J. Matrix Anal. Appl. 29, 1281–1296 (2007)MathSciNetCrossRef
23.
Zurück zum Zitat Jia, Z.: Refined iterative algorithms based on Arnoldi’s process for large unsymmetric eigenproblems. Linear Algeb. Appl. 259, 1–23 (1997)MATHCrossRef Jia, Z.: Refined iterative algorithms based on Arnoldi’s process for large unsymmetric eigenproblems. Linear Algeb. Appl. 259, 1–23 (1997)MATHCrossRef
24.
Zurück zum Zitat Jia, Z., Stewart, G.W.: An analysis of the Rayleigh-Ritz method for approximating eigenspaces. Math. Comput. 70, 637–647 (2001)MathSciNetMATH Jia, Z., Stewart, G.W.: An analysis of the Rayleigh-Ritz method for approximating eigenspaces. Math. Comput. 70, 637–647 (2001)MathSciNetMATH
25.
Zurück zum Zitat Kamvar, S., Haveliwala, T., Manning, C., Golub, G.H.: Extrapolation methods for accelerating PageRank computations. In: Proceedings of the 12th Conference on International, World Wide Web (2003) Kamvar, S., Haveliwala, T., Manning, C., Golub, G.H.: Extrapolation methods for accelerating PageRank computations. In: Proceedings of the 12th Conference on International, World Wide Web (2003)
26.
Zurück zum Zitat Kamvar, S., Haveliwala, T., Golub, G.H.: Adaptive methods for the computation of PageRank. Linear Algeb. Appl. 386, 51–65 (2004)MathSciNetMATHCrossRef Kamvar, S., Haveliwala, T., Golub, G.H.: Adaptive methods for the computation of PageRank. Linear Algeb. Appl. 386, 51–65 (2004)MathSciNetMATHCrossRef
27.
Zurück zum Zitat Kollias, G., Gallopoulos E.: Functional rankings with multidamping: Generalizing PageRank with inhomogeneous matrix products (submitted) Kollias, G., Gallopoulos E.: Functional rankings with multidamping: Generalizing PageRank with inhomogeneous matrix products (submitted)
29.
Zurück zum Zitat Langville, A., Meyer, C.: Google’s PageRank and beyond: the science of search engine rankings. Princeton University Press, Princeton (2006) Langville, A., Meyer, C.: Google’s PageRank and beyond: the science of search engine rankings. Princeton University Press, Princeton (2006)
30.
Zurück zum Zitat Lee, C., Golub, G.H., Zenios, S.: A Fast Two-Stage Algorithm for Computing PageRank and Its Extensions. Stanford University Technical Report, SCCM-03-15 (2003) Lee, C., Golub, G.H., Zenios, S.: A Fast Two-Stage Algorithm for Computing PageRank and Its Extensions. Stanford University Technical Report, SCCM-03-15 (2003)
31.
Zurück zum Zitat Manteuffel, T.: Adaptive procedure for estimating parameters for the nonsymmetric Tchebychev iteration. Numer. Math. 31, 183–208 (1978)MathSciNetMATHCrossRef Manteuffel, T.: Adaptive procedure for estimating parameters for the nonsymmetric Tchebychev iteration. Numer. Math. 31, 183–208 (1978)MathSciNetMATHCrossRef
32.
Zurück zum Zitat Moler, C.: The World’s Largest Matrix Computation. MATLAB News and Notes (2002) Moler, C.: The World’s Largest Matrix Computation. MATLAB News and Notes (2002)
33.
Zurück zum Zitat Morrison, J., Breitling, R., Higham, D., Gilbert, D.: A lock-and-key model for protein-protein interactions. Bioinformatics 2, 2012–2019 (2006)CrossRef Morrison, J., Breitling, R., Higham, D., Gilbert, D.: A lock-and-key model for protein-protein interactions. Bioinformatics 2, 2012–2019 (2006)CrossRef
34.
Zurück zum Zitat Nachtigal, N., Reichel, L., Trefethen, L.: A hybrid GMRES algorithm for nonsymmetric linear systems. SIAM J. Matrix Anal. Appl. 13, 796–825 (1992)MathSciNetMATHCrossRef Nachtigal, N., Reichel, L., Trefethen, L.: A hybrid GMRES algorithm for nonsymmetric linear systems. SIAM J. Matrix Anal. Appl. 13, 796–825 (1992)MathSciNetMATHCrossRef
35.
Zurück zum Zitat Page, L., Brin, S., Motwami, R., Winograd, T.: The PageRank citation ranking: bring order to the Web, Technical Report. Computer Science Department, Stanford University (1998) Page, L., Brin, S., Motwami, R., Winograd, T.: The PageRank citation ranking: bring order to the Web, Technical Report. Computer Science Department, Stanford University (1998)
36.
37.
38.
Zurück zum Zitat Saad, Y.: Numerical Methods for Large Eigenvalue Problems, Algorithms and Architectures for Advanced Scientific Computing. Manchester University Press, Manchester (1992) Saad, Y.: Numerical Methods for Large Eigenvalue Problems, Algorithms and Architectures for Advanced Scientific Computing. Manchester University Press, Manchester (1992)
39.
Zurück zum Zitat Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM, Philadelphia (2003)MATHCrossRef Saad, Y.: Iterative Methods for Sparse Linear Systems, 2nd edn. SIAM, Philadelphia (2003)MATHCrossRef
40.
Zurück zum Zitat Serra-Capizzano, S.: Jordan canonical form of the Google matrix: A potential contribution to the PageRank computation. SIAM Matrix Anal. Appl. 27, 305–312 (2005)MathSciNetMATHCrossRef Serra-Capizzano, S.: Jordan canonical form of the Google matrix: A potential contribution to the PageRank computation. SIAM Matrix Anal. Appl. 27, 305–312 (2005)MathSciNetMATHCrossRef
41.
Zurück zum Zitat Sharan, R., Ulitsky, I., Shamir, R.: Network-based prediction of protein function. Mol. Syst. Biol. 3, 88 (2007)CrossRef Sharan, R., Ulitsky, I., Shamir, R.: Network-based prediction of protein function. Mol. Syst. Biol. 3, 88 (2007)CrossRef
42.
Zurück zum Zitat Sidi, A., Shapira, Y.: Upper bounds for convergence rates of acceleration methods with initial iterations. Numer. Algeb. 18, 113–132 (1998)MathSciNetMATHCrossRef Sidi, A., Shapira, Y.: Upper bounds for convergence rates of acceleration methods with initial iterations. Numer. Algeb. 18, 113–132 (1998)MathSciNetMATHCrossRef
43.
Zurück zum Zitat Sidi, A.: Vector extrapolation methods with applications to solution of large systems of equations and to pagerank computations. Comput. Math. Appl. 56, 1–24 (2008)MathSciNetMATHCrossRef Sidi, A.: Vector extrapolation methods with applications to solution of large systems of equations and to pagerank computations. Comput. Math. Appl. 56, 1–24 (2008)MathSciNetMATHCrossRef
44.
Zurück zum Zitat Sorensen, D.: Implicit application of polynomial filters in a $k$-step Arnoldi method. SIAM J. Matrix Anal. Appl. 13, 357–385 (1992)MathSciNetMATHCrossRef Sorensen, D.: Implicit application of polynomial filters in a $k$-step Arnoldi method. SIAM J. Matrix Anal. Appl. 13, 357–385 (1992)MathSciNetMATHCrossRef
45.
Zurück zum Zitat Stewart, G.W., Sun, J.: Matrix Perturbation Theory. Academic Press, Boston (1990)MATH Stewart, G.W., Sun, J.: Matrix Perturbation Theory. Academic Press, Boston (1990)MATH
46.
Zurück zum Zitat Taylor, A., Higham, D.J.: CONTEST: A controllable test matrix toolbox for MATLAB. ACM Trans. Math. Soft. 35(26) (2009) Taylor, A., Higham, D.J.: CONTEST: A controllable test matrix toolbox for MATLAB. ACM Trans. Math. Soft. 35(26) (2009)
47.
Zurück zum Zitat Wong, L.: Using biological networks in protein function prediction and gene expression analysis. Internet Math. 7, 274–298 (2011)MathSciNetCrossRef Wong, L.: Using biological networks in protein function prediction and gene expression analysis. Internet Math. 7, 274–298 (2011)MathSciNetCrossRef
48.
49.
Zurück zum Zitat Wu, G., Wei, Y.: Arnoldi versus GMRES for computing PageRank: a theoretical contribution to Google’s PageRank problem. ACM Trans Inf. Syst. 28(11) (2010) Wu, G., Wei, Y.: Arnoldi versus GMRES for computing PageRank: a theoretical contribution to Google’s PageRank problem. ACM Trans Inf. Syst. 28(11) (2010)
50.
51.
Zurück zum Zitat Wu, K., Simon, H.: Thick-restart Lanczos method for large symmetric eigenvalue problems. SIAM J. Matrix Anal. Appl. 22, 602–616 (2000)MathSciNetMATHCrossRef Wu, K., Simon, H.: Thick-restart Lanczos method for large symmetric eigenvalue problems. SIAM J. Matrix Anal. Appl. 22, 602–616 (2000)MathSciNetMATHCrossRef
52.
Zurück zum Zitat Wu, G., Zhang, Y., Wei, Y.: Krylov subspace algorithms for computing GeneRank for the analysis of microarray data mining. J. Comput. Biol. 17, 631–646 (2010)MathSciNetCrossRef Wu, G., Zhang, Y., Wei, Y.: Krylov subspace algorithms for computing GeneRank for the analysis of microarray data mining. J. Comput. Biol. 17, 631–646 (2010)MathSciNetCrossRef
53.
Zurück zum Zitat Wu, G., Wang, Y., Jin, X.: A preconditioned and shifted GMRES algorithm for the PageRank Problem with multiple damping factors. SIAM J. Sci. Comput. 34, A2558–A2575 (2012)MathSciNetMATHCrossRef Wu, G., Wang, Y., Jin, X.: A preconditioned and shifted GMRES algorithm for the PageRank Problem with multiple damping factors. SIAM J. Sci. Comput. 34, A2558–A2575 (2012)MathSciNetMATHCrossRef
54.
Zurück zum Zitat Yu, Q., Miao, Z., Wu, G., Wei, Y.: Lumping algorithms for computing Google’s Page-Rank and its derivative, with attention to unreferenced nodes. Inform. Retriev. 15, 503–526 (2012)CrossRef Yu, Q., Miao, Z., Wu, G., Wei, Y.: Lumping algorithms for computing Google’s Page-Rank and its derivative, with attention to unreferenced nodes. Inform. Retriev. 15, 503–526 (2012)CrossRef
Metadaten
Titel
Accelerating the Arnoldi-Type Algorithm for the PageRank Problem and the ProteinRank Problem
verfasst von
Gang Wu
Ying Zhang
Yimin Wei
Publikationsdatum
01.10.2013
Verlag
Springer US
Erschienen in
Journal of Scientific Computing / Ausgabe 1/2013
Print ISSN: 0885-7474
Elektronische ISSN: 1573-7691
DOI
https://doi.org/10.1007/s10915-013-9696-x

Weitere Artikel der Ausgabe 1/2013

Journal of Scientific Computing 1/2013 Zur Ausgabe