skip to main content
article

Comparison of graph-based and logic-based multi-relational data mining

Published:01 December 2005Publication History
Skip Abstract Section

Abstract

We perform an experimental comparison of the graph-based multi-relational data mining system, Subdue, and the inductive logic programming system, CProgol, on the Mutagenesis dataset and various artificially generated Bongard problems. Experimental results indicate that Subdue can significantly outperform CProgol while discovering structurally large multi-relational concepts. It is also observed that CProgol is better at learning semantically complicated concepts and it tends to use background knowledge more effectively than Subdue. An analysis of the results indicates that the differences in the performance of the systems are a result of the difference in the expressiveness of the logic-based and the graph-based representations. The ability of graph-based systems to learn structurally large concepts comes from the use of a weaker representation whose expressiveness is intermediate between propositional and first-order logic. The use of this weaker representation is advantageous while learning structurally large concepts but it limits the learning of semantically complicated concepts and the utilization background knowledge.

References

  1. C. Anglano, A. Giordana, G. L. Bello, and L. Saitta. An experimental evaluation of coevolutive concept learning. In ICML, pages 19--27, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. H. Blockeel and L. D. Raedt. Top-down induction of first-order logical decision trees. Artif. Intell., 101(1-2):285--297, 1998.]]Google ScholarGoogle ScholarCross RefCross Ref
  3. M. Bongard. Pattern Recognition. Spartan Books, 1970.]]Google ScholarGoogle Scholar
  4. M. Botta and A. Giordana. Smart+: A multi-strategy learning tool. In IJCAI, pages 937--945, 1993.]]Google ScholarGoogle Scholar
  5. D. J. Cook and L. B. Holder. Substructure discovery using minimum description length and background knowledge. J. Artif. Intell. Res. (JAIR), 1:231--255, 1994.]]Google ScholarGoogle Scholar
  6. L. Dehaspe and H. Toivonen. Discovery of frequent datalog patterns. Data Min. Knowl. Discov., 3(1):7--36, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. S. Dzeroski. Multi-relational data mining: an introduction. SIGKDD Explorations, 5(1):1--16, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. L. Getoor. Link mining: a new data mining challenge. SIGKDD Explorations, 5(1):84--89, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraphs in the presence of isomorphism. In ICDM, pages 549--552, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. T. Imielinski and H. Mannila. A database perspective on knowledge discovery. Commun. ACM, 39(11):58--64, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In PKDD, pages 13--23, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. R. I. Kondor and J. D. Lafferty. Diffusion kernels on graphs and other discrete input spaces. In ICML, pages 315--322, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. M. Kuramochi and G. Karypis. Discovering frequent geometric subgraphs. In ICDM, pages 258--265, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. M. Kuramochi and G. Karypis. An efficient algorithm for discovering frequent subgraphs. IEEE Trans. Knowl. Data Eng., 16(9):1038--1051, 2004.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. T. Matsuda, T. Horiuchi, H. Motoda, and T. Washio. Extension of graph-based induction for general graph structured data. In PAKDD, pages 420--431, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. S. Muggleton. Inductive logic programming. New Generation Comput., 8(4):295--, 1991.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Muggleton. Inverse entailment and progol. New Generation Comput., 13(3&4):245--286, 1995.]]Google ScholarGoogle ScholarCross RefCross Ref
  18. S. Muggleton and C. Feng. Efficient induction of logic programs. In ALT, pages 368--381, 1990.]]Google ScholarGoogle Scholar
  19. S. Nijssen and J. N. Kok. A quickstart in frequent structure mining can make a difference. In KDD, pages 647--652, 2004.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. J. R. Quinlan. Learning logical definitions from relations. Machine Learning, 5:239--266, 1990.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. L. D. Raedt and W. V. Laer. Inductive constraint logic. In ALT, pages 80--94, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. J. Rissanen. Sochastic Complexity in Statistical Inquiry. Singapore: World Scientific Publishing, 1989.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Srinivasan, S. Muggleton, M. J. E. Sternberg, and R. D. King. Theories for mutagenicity: A study in first-order and feature-based induction. Artif. Intell., 85(1-2):277--299, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. X. Yan and J. Han. gspan: Graph-based substructure pattern mining. In ICDM, pages 721--724, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. X. Yan and J. Han. Closegraph: mining closed frequent graph patterns. In KDD, pages 286--295, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. J. M. Zelle, R. J. Mooney, and J. B. Konvisser. Combining top-down and bottom-up techniques in inductive logic programming. In ICML, pages 343--351, 1994.]]Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Comparison of graph-based and logic-based multi-relational data mining

        Recommendations

        Comments

        Login options

        Check if you have access through your login credentials or your institution to get full access on this article.

        Sign in

        Full Access

        • Published in

          cover image ACM SIGKDD Explorations Newsletter
          ACM SIGKDD Explorations Newsletter  Volume 7, Issue 2
          December 2005
          152 pages
          ISSN:1931-0145
          EISSN:1931-0153
          DOI:10.1145/1117454
          Issue’s Table of Contents

          Copyright © 2005 Authors

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 1 December 2005

          Check for updates

          Qualifiers

          • article

        PDF Format

        View or Download as a PDF file.

        PDF

        eReader

        View online with eReader.

        eReader