In recent years the web has become a major source of information for researchers and an essential part of commerce, most visibly in e-commerce. Machine learning, a branch of artificial intelligence, plays a vital role in improving the customer experience at e-commerce companies, and its techniques offer efficient ways to extract knowledge from large amounts of data. A reliable product analysis is critical to assessing the performance of any e-commerce company. However, because such companies operate diverse global infrastructures, many similar products end up grouped differently. The quality of product analysis therefore depends on the accuracy of product classification: the better the classification, the more insight can be gained about each product category. Classification is a fundamental problem in machine learning. The aim of this study is to compare the performance, measured by accuracy, of different supervised machine learning methods for classifying products. In this paper we compare nonlinear and rule-based techniques, tested on data from the Kaggle Otto Group Product Classification dataset. The performance of each algorithm is measured in two cases, on the dataset before feature selection (before preprocessing) and on the dataset after feature selection (after preprocessing), and compared in terms of accuracy. The experimental results show that the nonlinear technique (KNN) performs better overall than the rule-based technique (C5.0). Among the individual classifiers implemented, k-nearest neighbor achieves the highest accuracy, around 88%. An extensive study is given to explain the efficiency of the different classifiers.
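The evaluation protocol described in the abstract (measure accuracy before feature selection, then again after it) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic rather than the Otto dataset, the feature-ranking score is a simple hypothetical one chosen for brevity, only k-nearest neighbor is shown (C5.0 is not included), and all function names are invented for this example.

```python
import random
from collections import Counter

def make_data(n_per_class, rng):
    """Synthetic stand-in data (assumption): 3 classes, features 0-1 informative, 2-5 noise."""
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            informative = [rng.gauss(3.0 * label, 1.0) for _ in range(2)]
            noise = [rng.gauss(0.0, 3.0) for _ in range(4)]
            X.append(informative + noise)
            y.append(label)
    return X, y

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def rank_features(X, y):
    """Rank features by (variance of class means) / (overall variance), best first."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        class_means = []
        for label in sorted(set(y)):
            members = [v for v, lab in zip(column, y) if lab == label]
            class_means.append(sum(members) / len(members))
        scores.append(variance(class_means) / variance(column))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

def project(X, columns):
    """Keep only the given feature columns."""
    return [[row[j] for j in columns] for row in X]

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training points nearest to x (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def evaluate(train_X, train_y, test_X, test_y, k=5):
    """Accuracy: fraction of test points whose predicted label matches the true label."""
    preds = [knn_predict(train_X, train_y, x, k) for x in test_X]
    return sum(p == t for p, t in zip(preds, test_y)) / len(test_y)

rng = random.Random(0)
train_X, train_y = make_data(60, rng)
test_X, test_y = make_data(30, rng)

# Feature ranking is fitted on the training split only, then applied to both splits.
selected = sorted(rank_features(train_X, train_y)[:2])

acc_before = evaluate(train_X, train_y, test_X, test_y)
acc_after = evaluate(project(train_X, selected), train_y,
                     project(test_X, selected), test_y)

print(f"selected features: {selected}")
print(f"accuracy before feature selection: {acc_before:.3f}")
print(f"accuracy after feature selection:  {acc_after:.3f}")
```

The same two-case comparison would be repeated for each classifier under study; the numbers printed here come from synthetic data and say nothing about the results reported for the Otto dataset.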
- Evaluating the Progressive Performance of Machine Learning Techniques on E-commerce Data
Bindu Madhuri Cheekati
Sai Varun Padala
- Springer Singapore