Propositionalization-based relational subgroup discovery with RSD

Železný, Filip; Lavrač, Nada

doi:10.1007/s10994-006-5834-0

Published: 27 January 2006

Machine Learning, Volume 62, pages 33–63 (2006)

An Erratum to this article was published on 01 May 2006

Abstract

Relational rule learning algorithms are typically designed to construct classification and prediction rules. However, relational rule learning can also be adapted to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved by appropriately adapting rule learning and first-order feature construction. The proposed approach was successfully applied to standard ILP problems (East-West trains, King-Rook-King chess endgame, and mutagenicity prediction) and to two real-life problems (analysis of telephone calls and traffic accident analysis).
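
As a rough illustration of the idea in the abstract, the sketch below turns a tiny relational data set into a table of boolean first-order features and then scores each feature as a one-condition subgroup description. It is not the RSD implementation: the four toy trains, the two hand-written features, and the function names `propositionalize` and `wracc` are all assumptions made for this example, and weighted relative accuracy is used here only as one common subgroup quality measure.

```python
from typing import Callable, Dict, List

# Hypothetical toy data in the spirit of the East-West trains problem:
# each train has a direction (the target) and a list of cars.
TRAINS: List[dict] = [
    {"direction": "east", "cars": [{"length": "short", "roof": "flat"},
                                   {"length": "long", "roof": "none"}]},
    {"direction": "east", "cars": [{"length": "short", "roof": "peaked"}]},
    {"direction": "west", "cars": [{"length": "long", "roof": "none"},
                                   {"length": "long", "roof": "flat"}]},
    {"direction": "west", "cars": [{"length": "short", "roof": "none"}]},
]

# Hand-written first-order features (assumed, not generated automatically):
# each one is an existential test over the cars of a train.
FEATURES: Dict[str, Callable[[dict], bool]] = {
    "has_short_closed_car": lambda t: any(
        c["length"] == "short" and c["roof"] != "none" for c in t["cars"]),
    "has_long_car": lambda t: any(c["length"] == "long" for c in t["cars"]),
}


def propositionalize(examples: List[dict],
                     features: Dict[str, Callable[[dict], bool]]) -> List[Dict[str, bool]]:
    """Map each relational example to one row of boolean feature values."""
    return [{name: test(ex) for name, test in features.items()} for ex in examples]


def wracc(rows: List[Dict[str, bool]], labels: List[str],
          feature: str, target: str) -> float:
    """Weighted relative accuracy: p(Cond) * (p(Class | Cond) - p(Class))."""
    n = len(rows)
    covered = [lab for row, lab in zip(rows, labels) if row[feature]]
    if not covered:
        return 0.0
    p_cond = len(covered) / n
    p_class = sum(lab == target for lab in labels) / n
    p_class_given_cond = sum(lab == target for lab in covered) / len(covered)
    return p_cond * (p_class_given_cond - p_class)


rows = propositionalize(TRAINS, FEATURES)
labels = [t["direction"] for t in TRAINS]
scores = {name: wracc(rows, labels, name, "east") for name in FEATURES}
best = max(scores, key=scores.get)
print(best, scores[best])  # has_short_closed_car 0.25
```

On this toy table, the feature that covers exactly the two eastbound trains scores highest, while the feature that covers one train of each direction scores zero because its class distribution matches the overall one.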

Author information

Authors and Affiliations

Czech Technical University, Prague, Czech Republic
Filip Železný
Institute Jožef Stefan, Ljubljana, Slovenia, and Nova Gorica Polytechnic, Nova Gorica, Slovenia
Nada Lavrač

Corresponding author

Correspondence to Filip Železný.

Additional information

Editors: Hendrik Blockeel, David Jensen and Stefan Kramer

An erratum to this article is available at http://dx.doi.org/10.1007/s10994-006-8633-8.

About this article

Cite this article

Železný, F., Lavrač, N. Propositionalization-based relational subgroup discovery with RSD. Mach Learn 62, 33–63 (2006). https://doi.org/10.1007/s10994-006-5834-0

Received: 24 February 2003
Revised: 01 December 2004
Accepted: 27 July 2005
Published: 27 January 2006
Issue Date: February 2006
DOI: https://doi.org/10.1007/s10994-006-5834-0
