Information Systems and e-Business Management

01-09-2013 | Original Article

An empirical study of cost-sensitive learning in cultural modeling

Authors: Peng Su, Wenji Mao, Daniel Zeng

Published in: Information Systems and e-Business Management | Issue 3/2013

Abstract

Cultural modeling aims to develop behavioral models of groups and to analyze the impact of cultural factors on group behavior using computational methods. Machine learning methods, and classification in particular, play a central role in such applications. In modeling cultural data, standard classifiers are expected to perform well under the assumption that different classification errors have uniform costs. However, this assumption is often violated in practice, which severely hinders the performance of standard classifiers. To address this problem, this paper empirically studies cost-sensitive learning in cultural modeling. We take the cost factor into account when building the classifiers, with the aim of minimizing total misclassification costs. We conduct experiments to investigate four typical cost-sensitive learning methods, combine them with six standard classifiers, and evaluate their performance under various conditions. Our empirical study verifies the effectiveness of cost-sensitive learning in cultural modeling. Based on the experimental results, we gain a thorough insight into the problem of non-uniform misclassification costs, as well as into the selection of cost-sensitive methods, base classifiers, and method-classifier pairs for this domain. Furthermore, we propose an improved algorithm that outperforms the best method-classifier pair on the benchmark cultural datasets.
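The idea of minimizing total misclassification cost, rather than the error count, can be illustrated with a small sketch. It is not the algorithm proposed in the paper, and the abstract does not name the four methods studied; it only shows one common decision rule from the cost-sensitive learning literature, which predicts the class with the lowest expected cost. The cost matrix, labels, probabilities, and function names below are all hypothetical values chosen for illustration.

```python
# Illustrative sketch only: cost matrix, labels and probabilities are made up.

def total_cost(y_true, y_pred, cost):
    """Sum cost[actual][predicted] over all examples."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))

def min_expected_cost_class(probs, cost):
    """Return the prediction j that minimizes sum_i probs[i] * cost[i][j]."""
    n_classes = len(cost)
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)

# Assumed non-uniform costs: predicting 0 for an actual 1 costs 5,
# predicting 1 for an actual 0 costs 1, correct predictions cost 0.
COST = [[0, 1],
        [5, 0]]

y_true = [0, 0, 1, 1]
# Hypothetical class-probability estimates from some base classifier.
probs = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]

# Standard rule: pick the most probable class (implicitly uniform costs).
argmax_pred = [max(range(2), key=p.__getitem__) for p in probs]
# Cost-sensitive rule: pick the class with the lowest expected cost.
cost_pred = [min_expected_cost_class(p, COST) for p in probs]

print(argmax_pred, total_cost(y_true, argmax_pred, COST))
print(cost_pred, total_cost(y_true, cost_pred, COST))
```

In this toy example, the cost-sensitive rule makes one cheap error instead of one expensive error, so its total cost is lower even though both rules misclassify exactly one example.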


Title: An empirical study of cost-sensitive learning in cultural modeling
Authors: Peng Su, Wenji Mao, Daniel Zeng
Publication date: 01-09-2013
Publisher: Springer Berlin Heidelberg
Published in: Information Systems and e-Business Management / Issue 3/2013
Print ISSN: 1617-9846
Electronic ISSN: 1617-9854
DOI: https://doi.org/10.1007/s10257-012-0198-4
