2022 | Book

Machine Learning and Data Analytics for Solving Business Problems

Methods, Applications, and Case Studies

Editors: Bader Alyoubi, Chiheb-Eddine Ben Ncir, Ibraheem Alharbi, Anis Jarboui

Publisher: Springer International Publishing

Book Series: Unsupervised and Semi-Supervised Learning

Part of: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This book presents advances in business computing and data analytics by discussing recent and innovative machine learning methods designed to support decision-making processes. These methods form the theoretical foundations of intelligent management systems, which allow companies to understand the market environment, improve the analysis of customer needs, personalize content creatively, and design more effective business strategies, products, and services. The book gives an overview of recent methods (such as blockchain, big data, artificial intelligence, and cloud computing) so that readers can rapidly explore them and apply them to common business challenges. It aims to empower readers to leverage and develop creative supervised and unsupervised methods for solving business decision-making problems.

Frontmatter

Chapter 1. Predicting Salaries with Random-Forest Regression

Abstract

For companies, it is essential to know the market price of the salaries of their current and prospective employees. Predicting such salaries is challenging, as many factors need to be considered and large real datasets for learning are scarce. For this reason, research on salary prediction is comparatively rare and limited. In this study, we investigate whether and how an advanced machine-learning approach, namely ensembles of random-forest regression, can achieve high-quality salary predictions. We use a large real dataset of more than three million employees and more than 300 professions. Our approach learns, for each profession, a random-forest regression model to predict salaries. In our evaluation, we show that this approach, with a mean absolute percentage error (MAPE) of 17.1%, performs better than related work on salary prediction by machine-learning approaches. We identify three key factors for the results achieved: reducing the number of possible values of categorical variables, training separate models, and outlier handling.

Frank Eichinger, Moritz Mayer
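
The Chapter 1 abstract reports its result as a mean absolute percentage error (MAPE). As a minimal sketch of how that metric is usually computed (the function name and the salary figures below are hypothetical and are not taken from the chapter or its dataset):

```python
def mape(actual, predicted):
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one observation")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average of |actual - predicted| / |actual| over all observations.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical salaries for illustration only.
actual_salaries = [50000.0, 40000.0, 80000.0]
predicted_salaries = [45000.0, 44000.0, 80000.0]
error = mape(actual_salaries, predicted_salaries)
print(f"MAPE: {error:.2f}%")
```

Because each error is divided by the actual salary, a fixed absolute deviation counts more for low salaries than for high ones, which is one reason the metric is convenient when salaries span very different ranges across professions.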

Chapter 2. Data-Driven Analysis of Microfinance and Social Loans Before and During the COVID-19 Pandemic Using Exploratory Analysis and Decision Tree Classifiers

Abstract

Social banking and microfinance have become key factors in supporting low-income families and underprivileged citizens as well as in developing and sponsoring microenterprises. This role has become more important with the spread of the COVID-19 pandemic and its economic repercussions on individuals and microenterprises. In this chapter, we propose a data-driven study of the repercussions of the COVID-19 pandemic on the characteristics of the beneficiaries of social loans and microloans. Our analysis is based on a real case study of the Saudi Social Development Bank (SDB) and examines the microfinance loans granted by this important social financing bank in 2019 and 2020. The study mainly investigates changes in the demographic, social, and economic characteristics of the beneficiaries using both a bivariate exploratory analysis and a decision tree classifier. A decision tree classification model was built for individual and business microcredits to visualize the main beneficiary characteristics of each credit category before and during the COVID-19 pandemic. The resulting decision trees provide a deeper understanding of the main characteristics of each credit category, which will help managers design more suitable microfinance products.

Chiheb-Eddine Ben Ncir, Bader Alyoubi, Roaa Alrazyeg

Chapter 3. Identification of Credit Risks Using Cluster Analysis and Behavioural Scoring During the COVID-19 Pandemic

Abstract

The COVID-19 crisis had a significant impact on the banking sector, especially on customers' ability to repay their credit. This situation requires the establishment of new policies and strategies to assess risk levels. However, good strategies can only be developed if banks are able to differentiate between viable, non-viable, and viable but distressed debtors at a granular level, grouping borrowers with similar characteristics and resolving them comparably. To address these issues, we propose in this chapter a new behavioural scoring approach that identifies the different risk levels of existing credit accounts and finds the principal factors affecting credit risk before and during the pandemic. The proposed approach combines k-means cluster analysis with supervised learning using a decision tree. Interesting clusters of credit account behaviours are obtained, in addition to new rules describing the essential qualities of credit applicants before and during the pandemic.

Waad Bouaguel, Taghrid Al silimani
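
The Chapter 3 abstract names k-means cluster analysis as one building block of its behavioural scoring approach. The following is a minimal sketch of plain k-means only, not the chapter's method: the two features (credit utilisation ratio, number of missed payments), the data values, and the naive initialization from the first k points are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Plain k-means; returns (labels, centroids)."""
    # Assumption: initialize centroids from the first k points so the
    # sketch stays deterministic. Real uses typically pick them randomly.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical accounts: (utilisation ratio, missed payments).
accounts = [(0.1, 0), (0.2, 1), (0.9, 5), (0.8, 6)]
labels, centroids = kmeans(accounts, k=2)
print(labels)
```

In a pipeline of the kind the abstract describes, cluster labels like these could then serve as the target for a decision tree that produces readable rules. In practice the features would also be scaled first, since unscaled counts dominate ratios in Euclidean distance.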

Chapter 4. Improving Sales Prediction for Point-of-Sale Retail Using Machine Learning and Clustering

Abstract

Point-of-sale retail represents an important aspect of daily consumer purchases. Even with the growth of online retailing, physical retail stores provide useful services for consumers. Data analytics can be applied to improve the performance of this type of retailing by better predicting product sales and optimizing product availability. Large physical retail chains sell a wide range of products in different store locations, which makes high-quality predictions across products, categories, and store locations complex and often results in low-quality sales forecasts. Developing a data analytics model for every single product and store in a retail chain would be difficult to scale. Against this background, machine learning methods are highly promising and could be used to cluster stores with similar properties and then provide a single model for predicting their specific product sales. Yet the literature that provides a systematic approach for clustering stores based on a standardized list of properties is limited. This paper addresses this gap by identifying the main factors for clustering retail stores and examining combinations of clustering and prediction algorithms that improve sales forecasts in retail stores. The results show selected factors for organizing stores and present the best-performing algorithms for predicting product sales.

Chibuzor Udokwu, Patrick Brandtner, Farzaneh Darbanian, Taha Falatouri

Chapter 5. Telecom Customer Segmentation Using Deep Embedded Clustering Algorithm

Abstract

Telecom companies record customers' actions, which generates a large amount of data that can lead to crucial insights into customer behaviour and demands. Most telecom companies use customer segmentation to increase customer satisfaction, which entails dividing targeted customers into different groups based on demographics or usage perspectives such as gender, age group, buying behaviour, usage pattern, special interests, and other characteristics that represent the customer. With a larger number of attributes and the high sparsity of telecom data, identifying targeted customers becomes difficult, and various machine learning algorithms have been proposed for this task. Deep learning has gained huge popularity in various business analytics and operations. However, the use of deep learning for customer segmentation is very limited. This paper aims to segment telecom customer data using the deep embedded clustering algorithm. For the experiments, Kaggle's telco customer churn dataset is used. The results of our study indicate that the deep embedded clustering algorithm attains better segmentation results than traditional clustering algorithms such as k-means and hierarchical clustering.

R. Jothi, K. Muthukumaran
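
The Chapter 5 abstract relies on deep embedded clustering (DEC). As a rough sketch of the two distributions at the core of the commonly published DEC formulation (a Student's t soft assignment and a sharpened target distribution), assuming the embeddings have already been produced by an encoder network that is not shown here; the embedding values, centroids, and function names are hypothetical, and the chapter's own configuration may differ:

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def soft_assignments(embeddings, centroids, alpha=1.0):
    """q[i][j]: Student's t similarity of point i to centroid j, row-normalized."""
    q = []
    for z in embeddings:
        kernel = [
            (1.0 + squared_distance(z, mu) / alpha) ** (-(alpha + 1.0) / 2.0)
            for mu in centroids
        ]
        total = sum(kernel)
        q.append([value / total for value in kernel])
    return q


def target_distribution(q):
    """p[i][j] proportional to q[i][j]^2 / f[j], with f[j] the soft cluster size."""
    n_clusters = len(q[0])
    frequency = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / frequency[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


# Hypothetical 2-D embeddings and centroids for illustration only.
embeddings = [(0.0, 0.0), (0.2, 0.0), (4.0, 0.0), (4.2, 0.0)]
centroids = [(0.1, 0.0), (4.1, 0.0)]
q = soft_assignments(embeddings, centroids)
p = target_distribution(q)
segments = [row.index(max(row)) for row in q]
print(segments)
```

In the full algorithm, the encoder and the centroids are trained to reduce the divergence between q and p, so that confident assignments are reinforced step by step. This sketch only evaluates the two distributions once.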

Chapter 6. Semantic Image Quality Assessment Using Conventional Neural Network for E-Commerce Catalogue Management

Abstract

Managing catalogues is an important challenge in the field of electronic commerce because it can significantly help visitors efficiently select the items that interest them. On retail websites, all the items included in the catalogue are displayed in a particular order and classified into different categories. This manual grouping and ordering takes a great deal of time. Furthermore, the evaluation of image quality plays a very important role in managing the catalogue [5]. In fact, the images sent by suppliers are not always of good quality, which may lead to client dissatisfaction. To deal with these issues, we propose a new approach that aims to automatically manage the image catalogue and efficiently improve the quality of displayed images. The proposed approach is based on the design of a new no-reference Semantic Image Quality Assessment method using a conventional neural network (CNNs_SIQA) that employs a deep learning technique for automatically assessing perceived image quality. The results obtained show the effectiveness of our proposed approach in the automatic management of catalogues and in improving the quality of displayed images.

Sonia Ouni, Karim Kamoun, Mohamed AlAttas

Chapter 7. Contextual Recommender Systems in Business from Models to Experiments

Abstract

Within the last decade, several researchers have focused on using contextual information to design new systems that generate personalized recommendations matching users' specific contexts. In this respect, Recommender Systems (RS) have been used in different domains to assist users' decision making by providing item recommendations and thereby improving its quality. Context-Aware Recommender Systems (CARS) go further, taking contexts (e.g., time, location, occasion) into consideration to suggest items that are appropriate to users' specific contextual situations. In this chapter, we give an overview of the current state of the art in CARS and the decision-making process, together with the associated types, challenges, limitations, and business adoptions. Experimental evaluation processes that can be performed to assess the quality of any contextual recommendation system are discussed. In addition, an empirical comparison between CARS and baseline recommender systems is performed on two benchmark datasets.

Khedija Arour, Rim Dridi

Chapter 8. An Overview of Multi-View Methods for Text Clustering

Abstract

Text clustering, which partitions a given collection of documents into groups according to a certain similarity/dissimilarity criterion, has become an important challenge in text mining and machine learning. With advances in information acquisition technologies, textual data can frequently be represented using different techniques, generating multi-view data. In this chapter, we present an overview of existing clustering methods with a special emphasis on multi-view text clustering methods. We design a new categorization model based on the main properties identified in multi-view text clustering methods. To evaluate their performance, we perform extensive experiments on several real-world textual data sets. Based on the experimental results, we provide some insights for researchers who need to choose the best method for clustering multi-view textual data.

Maha Fraj, Mohamed Aymen Ben HajKacem, Nadia Essoussi

Chapter 9. Real-Time K-Prototypes for Incremental Attribute Learning Using Feature Selection

Abstract

Mixed data streams accompanied by new features are continuously generated at very high speed and volume. This makes it difficult for conventional clustering algorithms to create and maintain groups of similar entities. Moreover, not all newly emerging attributes are relevant for mining and decision-making. In this paper, we propose an incremental attribute and object learning clustering method based on the k-prototypes algorithm and using feature selection in a streaming data environment. We investigate the impact of applying a feature selection preprocessing technique in the incremental unsupervised attribute learning space, since it has not been evaluated so far. The results obtained show that our proposal achieves consistently better performance and yields more coherent clustering results in less time. In addition, performing feature selection before modeling the data, in the incremental attribute learning context, eases the learning process and speeds it up.

Siwar Gorrab, Fahmi Ben Rejab, Kaouther Nouira
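
The Chapter 9 abstract builds on the k-prototypes algorithm, which clusters records that mix numeric and categorical attributes. As a minimal sketch of the mixed dissimilarity measure commonly used by k-prototypes (squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches); the record layout, the weight gamma, and the function names are assumptions for illustration, and the chapter's incremental and feature-selection steps are not shown:

```python
def mixed_dissimilarity(record, prototype, numeric_idx, categorical_idx, gamma=1.0):
    """Squared Euclidean distance on numeric attributes plus
    gamma times the number of mismatching categorical attributes."""
    numeric_part = sum((record[i] - prototype[i]) ** 2 for i in numeric_idx)
    categorical_part = sum(1 for i in categorical_idx if record[i] != prototype[i])
    return numeric_part + gamma * categorical_part


def nearest_prototype(record, prototypes, numeric_idx, categorical_idx, gamma=1.0):
    """Index of the prototype with the smallest mixed dissimilarity."""
    return min(
        range(len(prototypes)),
        key=lambda j: mixed_dissimilarity(
            record, prototypes[j], numeric_idx, categorical_idx, gamma
        ),
    )


# Hypothetical mixed record: two numeric and two categorical attributes.
numeric_idx = [0, 1]
categorical_idx = [2, 3]
prototypes = [(0.0, 2.0, "blue", "A"), (5.0, 5.0, "red", "B")]
record = (1.0, 2.0, "red", "A")

distance = mixed_dissimilarity(
    record, prototypes[0], numeric_idx, categorical_idx, gamma=0.5
)
cluster = nearest_prototype(
    record, prototypes, numeric_idx, categorical_idx, gamma=0.5
)
print(distance, cluster)
```

The weight gamma balances the two attribute types: a value near zero makes the clustering behave like k-means on the numeric attributes alone, while a large value lets categorical mismatches dominate.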

Chapter 10. Applications of Industry 4.0 on Saudi Supply Chain Management: Technologies, Opportunities, and Challenges

Abstract

The Fourth Industrial Revolution (Industry 4.0) combines many recent technologies, such as robotics, artificial intelligence, additive manufacturing, blockchain, and the Internet of Things. Industry 4.0 can be applied to logistics and supply chain management (SCM), an application referred to as SCM 4.0. For example, SCM 4.0 could be used in manufacturing, production, packaging, shipment, and transmission. The Kingdom of Saudi Arabia (KSA) has many opportunities in the SCM sector. The two holy mosques are located in the KSA. The KSA also has long borders on both the Red Sea and the Arabian Gulf. In addition, the KSA is a major petroleum-producing country. These natural resources contribute greatly to the continual growth of the Saudi national economy. The KSA Vision 2030 aims to diversify the sources of national income, so SCM 4.0 may represent a real opportunity for achieving this objective. This chapter addresses in detail the different Industry 4.0 technologies and their recent applications in the SCM sector. The chapter also describes the opportunities and challenges of adopting SCM 4.0 in the KSA.

Taha M. Mohamed, Abdulaziz Alharbi, Ibrahim Alhassan, Sherif Kholeif

Backmatter

Title: Machine Learning and Data Analytics for Solving Business Problems
Editors: Bader Alyoubi, Chiheb-Eddine Ben Ncir, Ibraheem Alharbi, Anis Jarboui
Publisher: Springer International Publishing
Electronic ISBN: 978-3-031-18483-3
Print ISBN: 978-3-031-18482-6
DOI: https://doi.org/10.1007/978-3-031-18483-3
