A critical review of multi-objective optimization in data mining: a position paper

Abstract
This paper addresses the problem of evaluating the quality of a model built from data in a multi-objective optimization scenario, where two or more quality criteria must be optimized simultaneously. A typical example is the need to maximize both the accuracy and the simplicity of a classification model, or of a candidate attribute subset in attribute selection. We review three very different approaches to this problem, namely: (a) transforming the original multi-objective problem into a single-objective problem by using a weighted formula; (b) the lexicographic approach, where the objectives are ranked in order of priority; and (c) the Pareto approach, which consists of finding as many non-dominated solutions as possible and returning the set of non-dominated solutions to the user. We also present a critical review of the case for and against each of these approaches. The general conclusion is that the weighted-formula approach -- by far the most used in the data mining literature -- is to a large extent an ad hoc approach to multi-objective optimization, whereas the lexicographic and Pareto approaches are more principled and therefore deserve more attention from the data mining community.
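The three approaches contrasted in the abstract can be illustrated with a small sketch. The candidate names, scores, weights, and tolerance below are all hypothetical, chosen only to show how each strategy picks a "best" model from the same set of (accuracy, simplicity) pairs, both to be maximized; this is not the paper's own code.

```python
# Hypothetical candidate models scored as (accuracy, simplicity),
# both objectives to be maximized.
candidates = {
    "tree_a": (0.91, 0.40),
    "tree_b": (0.89, 0.85),
    "tree_c": (0.95, 0.30),
    "tree_d": (0.88, 0.80),
}

# (a) Weighted formula: collapse both objectives into one score with
# ad hoc weights, then pick the single highest-scoring model.
def weighted(scores, w_acc=0.7, w_simp=0.3):
    return max(scores, key=lambda m: w_acc * scores[m][0] + w_simp * scores[m][1])

# (b) Lexicographic: accuracy has strict priority; simplicity only breaks
# near-ties in accuracy (within a tolerance).
def lexicographic(scores, tol=0.02):
    best_acc = max(acc for acc, _ in scores.values())
    contenders = {m: s for m, s in scores.items() if best_acc - s[0] <= tol}
    return max(contenders, key=lambda m: contenders[m][1])

# (c) Pareto: x dominates y if x is at least as good in every objective
# and strictly better in at least one; return all non-dominated models.
def dominates(x, y):
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))

def pareto_front(scores):
    return [m for m, s in scores.items()
            if not any(dominates(t, s) for n, t in scores.items() if n != m)]
```

Note that (a) and (b) each return a single model, whose identity depends on the chosen weights or tolerance, while (c) returns a set of trade-off solutions and leaves the final choice to the user.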