skip to main content
research-article

CONTEST: A Controllable Test Matrix Toolbox for MATLAB

Published:01 February 2009Publication History
Skip Abstract Section

Abstract

Large, sparse networks that describe complex interactions are a common feature across a number of disciplines, giving rise to many challenging matrix computational tasks. Several random graph models have been proposed that capture key properties of real-life networks. These models provide realistic, parametrized matrices for testing linear system and eigenvalue solvers. CONTEST (CONtrollable TEST matrices) is a random network toolbox for MATLAB that implements nine models. The models produce unweighted directed or undirected graphs; that is, symmetric or unsymmetric matrices with elements equal to zero or one. They have one or more parameters that affect features such as sparsity and characteristic pathlength and all can be of arbitrary dimension. Utility functions are supplied for rewiring, adding extra shortcuts and subsampling in order to create further classes of networks. Other utilities convert the adjacency matrices into real-valued coefficient matrices for naturally arising computational tasks that reduce to sparse linear system and eigenvalue problems.

References

  1. Abello, J., Buchsbaum, A., and Westbrook, J. 1998. A functional approach to external graph algorithms. Lecture Notes in Computer Science, vol. 1461, 332--343. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Albert, R. and Barabási, A.-L. 2002. Statistical mechanics of complex networks. Rev. Modern Phys. 74, 47--97.Google ScholarGoogle ScholarCross RefCross Ref
  3. Alon, U. 2006. An Introduction to Systems Biology. Chapman and Hall/CRC, London.Google ScholarGoogle Scholar
  4. Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Croz, J. D., Greenbaum, A., Hammarling, S., McKenney, A., and Sorensen, D. 1999. LAPACK Users’ Guide, 3rd ed. SIAM, PA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Barabási, A.-L. and Albert, R. 1999. Emergence of scaling in random networks. Sci. 286, 5439, 509--12.Google ScholarGoogle Scholar
  6. Batagelj, V. and Brandes, U. 2005. Efficient generation of large random networks. Phys. Rev. E 71, 036113.Google ScholarGoogle ScholarCross RefCross Ref
  7. Boginski, V., Butenko, S., and Pardalos, P. M. 2003. On structural properties of the market graph. In Innovations in Financial and Economic Networks, A. Nagurney, Ed. Edward Elgar Publishers, 29--45.Google ScholarGoogle Scholar
  8. Boisvert, R., Pozo, R., Remington, K., Barrett, R., and Dongarra, J. 1997. Matrix market: A Web resource for test matrix collections. In The Quality of Numerical Software: Assessment and Enhancement, R. Boisvert, Ed. Chapman and Hall, London, 125--137. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Bollobás, B. 1985. Random Graphs. Academic, London.Google ScholarGoogle Scholar
  10. Bollobás, B., Riordan, O., Spencer, J., and Tusnády, G. 2001. The degree sequence of a scale-free random graph process. Random Structures Algor. 18, 279--290. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., and Wiener, J. 2000. Graph structure of the Web. Comput. Netw. 33, 309--320. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Chung, F. 1997. Spectral Graph Theory. American Mathematical Society, Providence, RI.Google ScholarGoogle Scholar
  13. Conyon, M. J. and Muldoon, M. R. 2006. The small world of corporate boards. J. Business Finance Account. 33, 1321--1343.Google ScholarGoogle ScholarCross RefCross Ref
  14. Cour, T., Benezit, F., and Shi, J. 2005. Spectral segmentation with multiscale graph decomposition. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 1124--1131. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Davis, T. 2007. The University of Florida sparse matrix collection. Tech. rep. CISE Department, REP-2007-298, University of Florida, USA.Google ScholarGoogle Scholar
  16. de Silva, E. and Stumpf, M. 2005. Complex networks and simple models in biology. J. R. Soc. Interface 2, 419--430.Google ScholarGoogle ScholarCross RefCross Ref
  17. de Silva, E., Thorne, T., Ingram, P., Agrafiot, I., Swire, J., Wiuf, C., and Stumpf, M. P. H. 2006. The effects of incomplete protein interaction data on structural and evolutionary inferences. BMC Biol. 4, 39.Google ScholarGoogle ScholarCross RefCross Ref
  18. Duff, I. S., Grimes, R. G., and Lewis, J. G. 1989. Sparse matrix test problems. ACM Trans. Math. Softw. 15, 1--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Erdös, P. and Rényi, A. 1959. On random graphs. Publ. Math. Debrecen 6, 290--297.Google ScholarGoogle ScholarCross RefCross Ref
  20. Fagiolo, G. 2007. Clustering in complex directed networks. Phys. Rev. 76, 026107.Google ScholarGoogle Scholar
  21. Faloutsos, M., Faloutsos, P., and Faloutsos, C. 1999. On power-law relationships of the internet topology. Comput. Commun. Rev. 29, 251--262. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Gilbert, E. N. 1959. Random graphs. Ann. Math. Statist. 30, 1141--1144.Google ScholarGoogle ScholarCross RefCross Ref
  23. Gilbert, J. R., Moler, C., and Schreiber, R. 1992. Sparse matrices in MATLAB: Design and implementation. SIAM J. Matrix Anal. Appl. 13, 333--356. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Grimmett, G. 1999. Percolation, 2nd ed. Springer.Google ScholarGoogle Scholar
  25. Grindrod, P. 2002. Range-Dependent random graphs and their application to modeling large smal-world proteome datasets. Phys. Rev. E 66, 066702.Google ScholarGoogle ScholarCross RefCross Ref
  26. Grindrod, P., Higham, D. J., and Kalna, G. 2008. Periodic reordering. Tech. rep. 6, University of Strathclyde, Department of Mathematics.Google ScholarGoogle Scholar
  27. Han, J. D. H., Dupuy, D., Bertin, N., Cusick, M. E., and M., V. 2005. Effect of sampling on topology predictions of protein-protein interaction networks. Nature Biotechnol. 23, 839--844.Google ScholarGoogle ScholarCross RefCross Ref
  28. Hendrickson, B. and Leland, R. 1994. The Chaco user’s guide: Version 2.0. Tech. rep. SAND94--2692, Sandia National Laboratories, Albuquerque, NM.Google ScholarGoogle Scholar
  29. Higham, D. J. 2003. Unravelling small world networks. J. Comput. Appl. Math. 158, 61--74. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Higham, D. J. 2005. Spectral reordering of a range-dependent weighted random graph. IMA J. Numer. Anal. 25, 443--457.Google ScholarGoogle ScholarCross RefCross Ref
  31. Higham, D. J. and Higham, N. J. 2000. MATLAB Guide. Society for Industrial and Applied Mathematics, Philadelphia, PA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Higham, D. J., Pržulj, N., and Rašajski, M. 2008. Fitting a geometric graph to a protein-protein interaction network. Bioinf. 24, 1093--1099. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Hu, Y. and Scott, J. A. 2003. 5HSL_MC735: A fast multilevel Fiedler and profile reduction code. RAL-TR-2003-36, Numerical Analysis Group, Computational Science and Engineering Department, Rutherford Appleton Laboratory.Google ScholarGoogle Scholar
  34. Kamper, L., Bozkurt, A., Rybacki, K., Geissler, A., Gerken, I., Stephan, K. E., and Kötter, R. 2002. An introduction to CoCoMac-Online. The online-interface of the primate connectivity database CoCoMac. In Neuroscience Databases---A Practical Guide, R. Kötter, Ed. Kluwer Academic, Norwell, MA, 155--169.Google ScholarGoogle Scholar
  35. Kauffman, S. A. 1969. Metabolic stability and epigenesis in randomly constructed genetic nets. J. Theor. Biol. 22, 437--467.Google ScholarGoogle ScholarCross RefCross Ref
  36. Khanin, R. and Wit, E. 2006. How scale-free are gene networks? J. Comput. Biol. 13, 3, 810--818.Google ScholarGoogle ScholarCross RefCross Ref
  37. Kiss, I. Z., Green, D. M., and Kao, R. R. 2006. The network of sheep movements within Great Britain: Network properties and their implications for infectious disease spread. J. Roy. Soc. Interface 3, 669--677.Google ScholarGoogle ScholarCross RefCross Ref
  38. Kleinberg, J. M. 2000. Navigation in a small world. Nature 406, 845.Google ScholarGoogle ScholarCross RefCross Ref
  39. Langville, A. N. and Meyer, C. D. 2006. Google’s PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press, Princeton, NJ. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Lovász, L. 1996. Random walks on graphs: A survey. In Paul Erdös is Eighty, D. Miklós, V. T. Sós, and T. Szönyi, Eds. János Bolyai Mathematical Society, Budapest, 353--398.Google ScholarGoogle Scholar
  41. Mangan, S. and Alon, U. 2003. Structure and function of the feed-forward loop network motif. Proc. Nat. Acad. Sci. 100, 11980--11985.Google ScholarGoogle ScholarCross RefCross Ref
  42. Mangan, S., Zaslaver, A., and Alon, U. 2003. The coherent feedforward loop serves as a sign-sensitive delay element in transcription networks. J. Math. Biol. 334, 2, 197--204.Google ScholarGoogle Scholar
  43. Milenkovic, T., Lai, J., and Przulj, N. 2008. GraphCrunch: A tool for large network analyses. BMC Bioinf. 9, 70.Google ScholarGoogle ScholarCross RefCross Ref
  44. Milgram, S. 1967. The small world problem. Psychol. Today 2, 60--67.Google ScholarGoogle Scholar
  45. Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M., and Alon, U. 2004. Superfamilies of evolved and designed networks. Sci. 303, 1538--1542.Google ScholarGoogle ScholarCross RefCross Ref
  46. Morrison, J. L., Breitling, R., Higham, D. J., and Gilbert, D. R. 2005. Generank: Using search engine technology for the analysis of microarray experiments. BMC Bioinf. 6, 233.Google ScholarGoogle ScholarCross RefCross Ref
  47. Morrison, J. L., Breitling, R., Higham, D. J., and Gilbert, D. R. 2006. A lock-and-key model for protein-protein interactions. Bioinf. 2, 2012--2019. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. Newman, M. E. J. 2004. Who is the best connected scientist? A study of scientific coauthorship networks. In Complex Networks, E. Ben-Naim et al., Eds. Springer, 337--370.Google ScholarGoogle Scholar
  49. Newman, M. E. J., Moore, C., and Watts, D. J. 2000. Mean-Field solution of the small-world network model. Phys. Rev. Lett. 84, 3201--3204.Google ScholarGoogle ScholarCross RefCross Ref
  50. Norris, J. R. 1997. Markov Chains. Cambridge University Press.Google ScholarGoogle Scholar
  51. Onody, R. N. and de Castro, P. A. 2004. Complex network study of Brazilian soccer players. Phys. Rev. E 70.Google ScholarGoogle ScholarCross RefCross Ref
  52. Page, L., Brin, S., Motwani, R., and Winograd, T. 1998. The PageRank citation ranking: Bringing order to the Web. Tech. rep., Stanford Digital Library Technologies Project.Google ScholarGoogle Scholar
  53. Penrose, M. 2003. Geometric Random Graphs. Oxford Univeristy Press.Google ScholarGoogle Scholar
  54. Porter, M. A., Mucha, P. J., Newman, M. E. J., and Warmbrand, C. M. 2005. A network analysis of committees in the United States House of Representatives. Proc. Nat. Acad. Sci. 102, 7057--7062.Google ScholarGoogle ScholarCross RefCross Ref
  55. Pržulj, N., Corneil, D. G., and Jurisica, I. 2004. Modeling interactome: Scale-Free or geometric? Bioinf. 20, 18, 3508--3515. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. Pržulj, N., Corneil, D. G., and Jurisica, I. 2006. Efficient estimation of graphlet frequency distributions in protein-protein interaction networks. Bioinf. 22, 974--980. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Pržulj, N. and Higham, D. J. 2006. Modeling protein-protein interaction networks via a stickiness index. J. Roy. Soc. Interface 3, 711--716.Google ScholarGoogle ScholarCross RefCross Ref
  58. Salathé, M., May, R. M., and Bonhoeffer, S. 2005. The evolution of network topology by selective removal. J. Roy. Soc. Interface 2, 533--536.Google ScholarGoogle ScholarCross RefCross Ref
  59. Sporns, O. and Zwi, J. D. 2004. The small world of the cerebral cortex. Neuroinf. 2, 145--162.Google ScholarGoogle ScholarCross RefCross Ref
  60. Stumpf, M. P. H., Wiuf, C., and May, R. M. 2005. Subnets of scale-free networks are not scale-free: Sampling properties of networks. Proc. Nat. Acad. Sci. 102, 4221--4224.Google ScholarGoogle ScholarCross RefCross Ref
  61. Thomas, A., Cannings, R., Monk, N. A. M., and Cannings, C. 2003. On the structure of protein-protein interaction networks. Biochem. Soc. Trans. 31, 1491--1496.Google ScholarGoogle ScholarCross RefCross Ref
  62. Titz, B., Schlesner, M., and Uetz, P. 2004. What do we learn from high-throughput protein interaction data? Expert Rev. Proteomics 1, 111--121.Google ScholarGoogle ScholarCross RefCross Ref
  63. Watts, D. J. and Strogatz, S. H. 1998. Collective dynamics of ‘small-world’ networks. Nature 393, 440--442.Google ScholarGoogle ScholarCross RefCross Ref
  64. Williams, R. J., Berlow, E. L., Dunne, J. A., Barabási, A.-L., and Martinez, N. D. 2002. Two degrees of separation in complex food webs. Proc. Nat. Acad. Sci. 99, 12913--12916.Google ScholarGoogle ScholarCross RefCross Ref
  65. Xenarios, I., Salwinski, L., Duan, X. J., Higney, P., Kim, S. M., and D., E. 2002. DIP, the database of interacting proteins: A research tool for studying cellular networks of protein interactions. Nucleic Acids Res. 30, 1, 303--305.Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. CONTEST: A Controllable Test Matrix Toolbox for MATLAB

        Recommendations

        Reviews

        Zahari Zlatev

        CONTEST is a toolbox for MATLAB that generates random networks. This paper describes its development. The toolbox is based on the implementation of nine models. The models produce directed graphs that can also be considered classes of adjacency matrices. It is possible to control, by using a few parameters, the order of the matrices produced and, to a certain degree, some other properties (within the particular model selected among the set of nine models). It is possible to convert the adjacency matrices in real-valued symmetric matrices. The paper briefly explains how undirected graphs can be generated when nonsymmetric matrices are to be produced and used. The first two sections of the paper discuss three introductory issues: the motivation for carrying out this research, short remarks about the notation used, and some results previously obtained in the field, with many references to relevant scientific works. The nine models used in CONTEST are briefly described in Section 3. The authors visualize sparsity patterns of matrices produced by these nine models in Figure 1 (by using matrices of order 100), and they provide concise information related to the application of MATLAB codes that produce directed graphs belonging to the nine models. The codes are in fact MATLAB functions that depend on several parameters (up to five). The first parameter is always the order of the desired matrix. By systematically varying this parameter, it is possible to produce matrices of different orders and to investigate what happens with the performance of a given sparse matrix code when the order of the tested matrices is gradually increased. The applicability of the other parameters for similar systematic investigations is not discussed. Default values for each of the other parameters are given. The whole description is limited to the MATLAB environment, but the code could easily be rewritten in some other programming language, such as Fortran or C++. Section 4 introduces and discusses seven utility functions that can be used to modify a given network according to several criteria. The first parameter is always the matrix that has to be modified. The second parameter-if one exists in the function call-describes the criterion that should be used to modify the matrix. The path length between two arbitrary nodes and the curvature of an arbitrary node can also be calculated. Once again, the description is limited to MATLAB. The computational cost of the minimum degree ordering algorithm-which is, for some strange reason, related to the number of nonzero elements in the Cholesky factor L -is very briefly discussed in Section 5, which merely states that the order of the matrices used in the experiments varies from 50 to 10,000. The names of the two models selected for the experiments are given, but it is not clear how the matrices are created. Questions arise regarding the tests with the Gilbert model (results given in Figure 3): Do the authors create one matrix for each value of n in the 50 to 10,000 range by keeping p fixed, or are the matrices created for selected set of values of n and for different values of p __?__ The numbers of the matrices used in Figures 3 and 4 are not given-they only show that many matrices were used. It should also be mentioned that using sparse matrices of the order ? 10,000 might have been justified 20 or 30 years ago, but nowadays, one solves linear systems with dense matrices of the order of up to 10,000 (or even higher), without any computational difficulties. Therefore, it would have been much more desirable to see significantly larger sparse matrices (of orders greater than one million) in the experiments discussed. It would have been more informative and more relevant, especially for the purposes defined at the beginning of the paper, to select some sparse solver, to run it with matrices of all nine models for a small set of large values of n , to present the computing times, and to discuss the differences. CONTEST is, without doubt, an excellent tool for producing different kinds of networks, but the authors state several times that the purpose of using CONTEST is to test solvers for systems of linear algebraic equations and for eigenvalue problems, when the matrices involved are sparse. This is a much more ambitious task, and some further development will perhaps be needed to solve it more successfully. It is necessary to have a simple tool that will allow the user to create matrices of different orders, to vary the sparsity patterns of the nonzero elements in the matrices, to vary the density of the nonzero elements in the created matrices, and to vary the condition numbers of the tested matrices (which is especially important when iterative methods with or without preconditioning are to be used). In other words, it would be necessary, by using the methodology proposed by the authors, to prepare a function, say, sparse_matrix, such that the call a = sparse_matrix(order,pattern,density,condition) will produce a sparse matrix of given order, sparsity pattern, density of the nonzero elements, and condition number. Moreover, some clear and very simple rules for the use of the four parameters must be given. In CONTEST, it is clear how to systematically vary the order of the matrices-by varying the parameter n in each of the nine models. Figure 1 indicates that the pattern can also be varied to a certain degree by specifying different models. It is also possible to vary the density of the nonzero elements, but some more explicit rules on how to do it would help, as it is not clear how the condition numbers of the generated matrices can be systematically varied. For example, it is preferable to have a larger condition number when the value of the parameter condition in sparse_matrix is increased; in fact, it would be very convenient for the user if similar simple rules were valid for all parameters of sparse_matrix. Online Computing Reviews Service

        Access critical reviews of Computing literature here

        Become a reviewer for Computing Reviews.

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM Transactions on Mathematical Software
          ACM Transactions on Mathematical Software  Volume 35, Issue 4
          February 2009
          108 pages
          ISSN:0098-3500
          EISSN:1557-7295
          DOI:10.1145/1462173
          Issue’s Table of Contents

          Copyright © 2009 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 February 2009
          • Accepted: 1 June 2008
          • Revised: 1 May 2008
          • Received: 1 June 2007
          Published in toms Volume 35, Issue 4

          Permissions

          Request permissions about this article.

          Request Permissions

          Check for updates

          Qualifiers

          • research-article
          • Research
          • Refereed

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader