Abstract
We perform an experimental comparison of the graph-based multi-relational data mining system, Subdue, and the inductive logic programming system, CProgol, on the Mutagenesis dataset and various artificially generated Bongard problems. Experimental results indicate that Subdue can significantly outperform CProgol while discovering structurally large multi-relational concepts. It is also observed that CProgol is better at learning semantically complicated concepts and it tends to use background knowledge more effectively than Subdue. An analysis of the results indicates that the differences in the performance of the systems are a result of the difference in the expressiveness of the logic-based and the graph-based representations. The ability of graph-based systems to learn structurally large concepts comes from the use of a weaker representation whose expressiveness is intermediate between propositional and first-order logic. The use of this weaker representation is advantageous while learning structurally large concepts but it limits the learning of semantically complicated concepts and the utilization background knowledge.
- C. Anglano, A. Giordana, G. L. Bello, and L. Saitta. An experimental evaluation of coevolutive concept learning. In ICML, pages 19--27, 1998.]] Google ScholarDigital Library
- H. Blockeel and L. D. Raedt. Top-down induction of first-order logical decision trees. Artif. Intell., 101(1-2):285--297, 1998.]]Google ScholarCross Ref
- M. Bongard. Pattern Recognition. Spartan Books, 1970.]]Google Scholar
- M. Botta and A. Giordana. Smart+: A multi-strategy learning tool. In IJCAI, pages 937--945, 1993.]]Google Scholar
- D. J. Cook and L. B. Holder. Substructure discovery using minimum description length and background knowledge. J. Artif. Intell. Res. (JAIR), 1:231--255, 1994.]]Google Scholar
- L. Dehaspe and H. Toivonen. Discovery of frequent datalog patterns. Data Min. Knowl. Discov., 3(1):7--36, 1999.]] Google ScholarDigital Library
- S. Dzeroski. Multi-relational data mining: an introduction. SIGKDD Explorations, 5(1):1--16, 2003.]] Google ScholarDigital Library
- L. Getoor. Link mining: a new data mining challenge. SIGKDD Explorations, 5(1):84--89, 2003.]] Google ScholarDigital Library
- J. Huan, W. Wang, and J. Prins. Efficient mining of frequent subgraphs in the presence of isomorphism. In ICDM, pages 549--552, 2003.]] Google ScholarDigital Library
- T. Imielinski and H. Mannila. A database perspective on knowledge discovery. Commun. ACM, 39(11):58--64, 1996.]] Google ScholarDigital Library
- A. Inokuchi, T. Washio, and H. Motoda. An apriori-based algorithm for mining frequent substructures from graph data. In PKDD, pages 13--23, 2000.]] Google ScholarDigital Library
- R. I. Kondor and J. D. Lafferty. Diffusion kernels on graphs and other discrete input spaces. In ICML, pages 315--322, 2002.]] Google ScholarDigital Library
- M. Kuramochi and G. Karypis. Discovering frequent geometric subgraphs. In ICDM, pages 258--265, 2002.]] Google ScholarDigital Library
- M. Kuramochi and G. Karypis. An efficient algorithm for discovering frequent subgraphs. IEEE Trans. Knowl. Data Eng., 16(9):1038--1051, 2004.]] Google ScholarDigital Library
- T. Matsuda, T. Horiuchi, H. Motoda, and T. Washio. Extension of graph-based induction for general graph structured data. In PAKDD, pages 420--431, 2000.]] Google ScholarDigital Library
- S. Muggleton. Inductive logic programming. New Generation Comput., 8(4):295--, 1991.]] Google ScholarDigital Library
- S. Muggleton. Inverse entailment and progol. New Generation Comput., 13(3&4):245--286, 1995.]]Google ScholarCross Ref
- S. Muggleton and C. Feng. Efficient induction of logic programs. In ALT, pages 368--381, 1990.]]Google Scholar
- S. Nijssen and J. N. Kok. A quickstart in frequent structure mining can make a difference. In KDD, pages 647--652, 2004.]] Google ScholarDigital Library
- J. R. Quinlan. Learning logical definitions from relations. Machine Learning, 5:239--266, 1990.]] Google ScholarDigital Library
- L. D. Raedt and W. V. Laer. Inductive constraint logic. In ALT, pages 80--94, 1995.]] Google ScholarDigital Library
- J. Rissanen. Sochastic Complexity in Statistical Inquiry. Singapore: World Scientific Publishing, 1989.]] Google ScholarDigital Library
- A. Srinivasan, S. Muggleton, M. J. E. Sternberg, and R. D. King. Theories for mutagenicity: A study in first-order and feature-based induction. Artif. Intell., 85(1-2):277--299, 1996.]] Google ScholarDigital Library
- X. Yan and J. Han. gspan: Graph-based substructure pattern mining. In ICDM, pages 721--724, 2002.]] Google ScholarDigital Library
- X. Yan and J. Han. Closegraph: mining closed frequent graph patterns. In KDD, pages 286--295, 2003.]] Google ScholarDigital Library
- J. M. Zelle, R. J. Mooney, and J. B. Konvisser. Combining top-down and bottom-up techniques in inductive logic programming. In ICML, pages 343--351, 1994.]]Google ScholarCross Ref
Index Terms
- Comparison of graph-based and logic-based multi-relational data mining
Recommendations
Qualitative comparison of graph-based and logic-based multi-relational data mining: a case study
MRDM '05: Proceedings of the 4th international workshop on Multi-relational miningThe goal of this paper is to generate insights about the differences between graph-based and logic-based approaches to multi-relational data mining by performing a case study of graph-based system, Subdue and the inductive logic programming system, ...
Multi-relational Data Mining: a perspective
EPIA '01: Proceedings of the10th Portuguese Conference on Artificial Intelligence on Progress in Artificial Intelligence, Knowledge Extraction, Multi-agent Systems, Logic Programming and Constraint SolvingMulti-relational data mining (MRDM) is a form of data mining operating on data stored in multiple database tables. While machine learning and data mining are traditionally concerned with learning from single tables, MRDM is required in domains where the ...
Multi-Relational Data Mining Based on Higher-Order Inductive Logic Programming
GCIS '09: Proceedings of the 2009 WRI Global Congress on Intelligent Systems - Volume 02This paper presents a novel multi-relational data mining (MRDM) approach from a perspective of considering higher-order inductive logic programming to dealing with the representation formalism problems of existing multi-relational data mining ...
Comments