Application of Machine Learning Techniques in Injection Molding Quality Prediction: Implications on Sustainable Manufacturing Industry

Jung, Hail; Jeon, Jinsu; Choi, Dahui; Park, Jung-Ywn

doi:10.3390/su13084120


¹ School of Management Engineering, Ulsan National Institute of Science and Technology, 50 UNIST-gil, Ulsan 44919, Korea

² Graduate School of Interdisciplinary Management, Ulsan National Institute of Science and Technology, 50 UNIST-gil, Ulsan 44919, Korea

³ Graduate School of Technology and Innovation Management, Ulsan National Institute of Science and Technology, 50 UNIST-gil, Ulsan 44919, Korea

* Author to whom correspondence should be addressed.

Sustainability 2021, 13(8), 4120; https://doi.org/10.3390/su13084120

Submission received: 9 March 2021 / Revised: 1 April 2021 / Accepted: 6 April 2021 / Published: 7 April 2021

(This article belongs to the Special Issue Fourth Revolution and Sustainability)


Abstract

With sustainable growth highlighted as a key to success in Industry 4.0, manufacturing companies attempt to optimize production efficiency. In this study, we investigated whether machine learning has explanatory power for quality prediction problems in the injection molding industry. One concern in the injection molding industry is how to predict the quality of molded products and which factors affect it. While this is a major concern, prior studies have not yet examined such issues, especially using machine learning techniques. The objective of this article, therefore, is to apply several machine learning algorithms and compare their performance in quality prediction. Using tree-based algorithms, regression-based algorithms, and an autoencoder, we confirmed that machine learning models capture the complex relationship and that the autoencoder outperforms the other models in terms of accuracy, precision, recall, and F1-score. Feature importance tests also revealed that temperature and time are influential factors that affect quality. These findings have strong implications for enhancing sustainability in the injection molding industry. Sustainable management in Industry 4.0 requires adopting artificial intelligence techniques. In this manner, this article may be helpful for businesses that are considering the significance of machine learning algorithms in their manufacturing processes.

Keywords:

injection molding; quality prediction; regression; decision tree; autoencoder; machine learning; feature importance; characteristics importance

1. Introduction

Sustainable growth has become important for firms, especially in Industry 4.0. Integration of the physical and digital systems of production is the main concern of Industry 4.0 [1]. Industry 4.0 also enables continuous contact between machines, people, products, and even production materials. It is inextricably linked with the Internet of Things (IoT), Machine-to-Machine (M2M) technology, and Machine Learning (ML). Among them, machine learning and deep learning are regarded as the most effective solutions [2]. For example, results on a test renewable microgrid show that a machine learning-based structure can solve the problem with high accuracy [3]. One approach toward sustainable growth is to shift the overall manufacturing process toward autonomous manufacturing, the core of which is information accessibility that enables the maintenance of manufacturing agility [4]. Specifically, automated data collection from machines and the application of machine learning techniques to the collected data for automated quality prediction or fault detection are two significant factors driving Industry 4.0. By combining novel techniques based on machine learning or deep learning with manufacturing processes, the performance of the systems is enhanced and can be monitored in real time, with data-driven, continuous learning from a more varied range of data sources [5].

One concern in deploying machine learning or deep learning techniques in manufacturing fields is the selection of appropriate algorithms. Because these techniques are very sensitive to the type and size of the input data, appropriate techniques should be selected for each manufacturing type [6]. In other words, it is crucial to match specific types of algorithms to a given manufacturing business to fully enhance the manufacturing process. In this respect, this study attempted to employ machine learning and deep learning techniques in injection molding businesses. Specifically, we used prediction algorithms to verify whether they are suitable for quality prediction problems.

Injection molding is a complex production system. Injection molds are used as semifinal or final parts that can be used to produce the end products. Injection molds are frequently used in manufacturing plastic parts, and these parts are used in various businesses, such as automobile, shoe, and electronics manufacturing. As the injection molding industry supplies the base product for other manufacturing industries, it is considered a “root” industry, and the industry size is growing. With its importance growing every year, many studies have focused on designing the manufacturing process of injection molding and how to efficiently test the molding [7]. While the literature on methods to effectively improve the manufacturing process of injection molding is growing, relatively little attention has been given to how to employ modern techniques, such as machine learning or deep learning models, to predict the quality of injection molding products. This is of particular importance as injection molds are semifinal parts, and the customers of the molds use the mold parts to produce the final products. If defective molds are delivered to customers, it is highly likely that they will be dissatisfied.

Using a private injection molding production and quality dataset from Hanguk Mold, an injection molding company in Ulsan, South Korea, we deployed several machine learning and deep learning models to empirically test which models are suitable for injection molding businesses. The company manufactures molding products for car manufacturers, and the data we obtained from the company were a large manufacturing dataset from injection machines on one specific item. The injection machine data included injection time (s), filling time (s), plasticizing time (s), cycle time (s), clamp close time (s), cushion position (mm), switch over position (mm), plasticizing position (mm), clamp open position (mm), max injection speed (mm/s), max screw RPM (RPM), average screw RPM (RPM), max injection pressure (MPa), max switch over pressure (MPa), max back pressure (MPa), average back pressure (MPa), barrel temperature (°C), and mold temperature (°C).

Of the numerous machine learning algorithms available, we deployed techniques that are frequently used in the manufacturing industries. Specifically, we used tree-based, regression-based, and autoencoder models. Tree-based algorithms included random forest, gradient boost, XGBoost, LightGBM, and CatBoost. Regression-based algorithms included logistic regression and support vector machine. Finally, we also used an autoencoder model. Among these models, we found that the autoencoder model performs well in quality prediction problems in injection molding compared to regression-based and decision tree-based models. This is largely because of the complexity of the input variables in injection molding. Autoencoder models generally have strengths in settings with complex input features. Furthermore, we calculated the feature and characteristics importance of the predictive models to investigate which covariates are significant factors that determine the quality of the molding. We report that the models are generally in close agreement on the most influential predictors. Feature importance tests found that the molding temperature, hopper temperature, injection time, and cycle time factors were largely influential. Such common findings imply that the variation in the values of the abovementioned variables is a major cause of production defects; therefore, we highlight the importance of monitoring these variables in injection molding production.

This study aimed to apply modern machine learning and deep learning techniques to the injection molding business. Specifically, focusing on the quality prediction issue, we showed that autoencoder models are suitable for such businesses. This study contributes to the literature and to the future sustainable growth of injection molding businesses. We contribute to the literature by showing that autoencoder models have high explanatory power in explaining the quality of injection molds. To our knowledge, this is the first attempt to employ machine learning algorithms in plastic injection molding businesses and compare the performance of the models head to head. Our extensive comparison among several models showed that an autoencoder-based model outperforms other machine learning models. Furthermore, we contribute to on-site manufacturing businesses by showing the key variables that influence product quality. With over 50 real-time variables collected during the injection molding process, it is difficult for humans to identify influential variables. However, with the help of modern techniques, we found that temperature and time are important features, and such findings may be used by other injection molding companies. The contributions of this research are similar to those of prior studies that apply modern statistical methods in practical businesses [8].

While injection molding businesses have long desired to identify the main drivers that affect product quality, this has not been easy because the relationship between variables is rather complex. Complex relationships are not well captured by classical statistical models. In this regard, we deployed machine learning techniques to investigate the causes. This is possible because complexity is not a hurdle for machine learning models. The findings that molding temperature, hopper temperature, injection time, and cycle time are major factors are important for businesses' sustainable growth. Defective items are costly for both manufacturing companies and the environment. Manufacturers cannot sell defective items and therefore waste resources. From an environmental perspective, discarded defective items harm the environment. To reduce manufacturing cost and environmental risk, it is critical for businesses to understand the main factors that cause defects. By monitoring the important features suggested in this article, firms may reduce the defect ratio, which would reduce manufacturing costs and environmental risks and would ultimately increase their advantages in sustainable growth.

Furthermore, injection molding businesses are facing challenges because quality costs are increasing owing to rising wages. Ordinary injection molding firms station a couple of employees beside each injection machine to check the quality of the manufactured product and decide whether it is defective. A medium-sized injection molding manufacturer that operates around a hundred injection machines has over two hundred employees who check the quality of products. This generates a huge cost, especially in high-income countries. Therefore, it has long been questioned in the manufacturing field whether this cost could be minimized by using recent machine learning algorithms. However, owing to a lack of data, empiricists have found it difficult to analyze injection molding data and report whether machine learning techniques have the potential to replace human labor, at least in quality monitoring. In this manner, using a large dataset generated from injection machines, we investigated whether recent machine learning-based classification algorithms can classify items well by their quality. We found that an autoencoder-based model outperforms other models and that its performance is suitable for application in real injection molding businesses. This also means that applying machine learning techniques at manufacturing sites may reduce quality monitoring costs, which have been a major hurdle to sustainable growth.

We begin by presenting a literature review of both the injection molding industry and modern machine learning and deep learning techniques used in this study in Section 2. We describe the data and the methodologies used for quality prediction in injection molding in Section 3. The main results are provided in Section 4, where we also provide descriptive statistics, model performance comparisons, and feature importance. We then discuss the findings in Section 5 and, finally, the conclusions in Section 6.

2. Literature Review

2.1. Injection Molding

Enhancing production efficiency has long been a research question in the injection-molding industry. Prior studies largely focused on methods to improve the cooling system of the injection molding process, improve energy efficiency, alter cavity design, and improve the scheduling policy.

Temperature is a significant factor in determining the quality of the product. This is because the injection molding process involves melting the resin and subsequent cooling of the manufactured product. With its importance underscored, the literature attempted to enhance the layout design of the cooling system of injection molding. Searching on the Google Scholar data source, we were able to list related articles. For instance, a heuristic searching algorithm framework was used to develop the cooling circuits in the layout designs [9], and convex optimization models were further studied to improve the energy transfer efficiency [10]. Furthermore, another strand of literature deployed topology optimization to simplify the cooling process analysis [11]. K. J. Lee et al. (2020) [12] performed unsupervised probability matching between each instance and output based on injection molding data in the semiconductor industry to generate a training dataset with one-to-one relationships and apply k-nearest neighbor (KNN). It performed better than simply applying supervised learning methods such as support vector machine, random forest, and KNN.

Another critical research topic in the injection-molding industry is efficient energy usage. The guideline for characterizing the energy consumption around the injection molding process consists of five steps [13]. Under these guidelines, we can estimate a variety of injection molding manufacturing processes and products by considering the theoretical minimum energy that was computed with part design and process planning. Thus, studies have largely focused on a variety of perspectives to enhance the efficiency and sustainability of the injection-molding manufacturing processes. Regarding the literature on cavities, which constitute a major part of injection molding [14], literature focused on ways to save manufacturing time. One approach was to exploit the intelligent cavity layout design system to help injection molding designers in cavity design steps [15]. Another study examined the parting surface and cavity blocks in a computer-aided injection molding design system [16].

Recent studies have also investigated how optimizing the scheduling of injection molding production may enhance manufacturing efficiency. For example, a deep Q-network was deployed to determine the scheduling policy to minimize the total tardiness [4]. The authors found that the deep reinforcement learning method outperformed the dispatching rules that are popularly used for minimizing the total weighted tardiness. Another recent study applied transfer learning between different injection molding processes to reduce the amount of data needed for model training [17]. The authors used different approaches to ANN models; 16 training samples provided an average R² value of 0.88 in that paper.

Ke, K.-C. et al. (2021) [18] filtered out outliers in the input data and converted the measured quality into a quality class used as output data. The prediction accuracy of the MLP model was improved, and the quality of finished parts was classified into various quality levels. The model classified parts as “qualified,” “unqualified,” or “to-be-confirmed” and applied additional quality assessments only to “to-be-confirmed” products, significantly reducing quality management costs.

2.2. Machine Learning

Studies on machine learning and its applications are proliferating. Focusing on its implications for solving issues in manufacturing businesses, research has focused on predicting failures [6]. Cinar et al. (2020) [14] and Binding, Dykeman, and Pang (2019) [19] forecasted the downtime of manufacturing machines using real-time prediction models. They utilized unstructured historical machine data to train the machine learning classification algorithms, including random forest, XGBoost, and logistic regression, to predict machine failures [6]. Qi, X et al. (2019) [20] conducted a study to apply neural network algorithms to complete additive manufacturing process chains from design to post-treatment. Yang, He, and Li (2020) [21] employed a machine learning-based approach to obtain an appropriate estimation model for the power consumption of the mask image projection stereolithography process. Among stepwise linear regression, shallow neural network, and stacked autoencoders, stacked autoencoders had the best performance. Reference [22] researched the quality control of continuous flow manufacturing. The authors labeled data with random forest-based pseudo-labeling and deployed recurrent neural network models.

Ruey-Shiang, G et al. (2020) [23] proposed a random forest model to detect the mean shifts in multivariate control charts during production. The proposed model well detected the moving average and was able to identify the exact variables. M. Strano et al. (2006) [24] proposed the logistic regression for the empirical determination of the locus of the principal planar strains where failure is most likely to occur. They directly derived the probability of the failure as a function of different predictor variables through the model. Pal, M. (2005) [25] compared classification accuracy between Random forest and SVM for remote detection. Zhang, C. et al. (2019) [26] built a two-stage energy-efficient decision-making mechanism using random forest. The authors selected appropriate control strategies for different occasions in the manufacturing process. Alhamad, I. M. et al. (2019) [27] compared the machine learning models that predict faults during the wafer fabrication process of the semiconductor industries. The combinations of feature selection methods and four models were k-nearest neighbor (KNN), random forest (RF), Naïve Bayes (NB), and decision tree (DT). The authors then compared recall, precision, F-measure, and false-positive rates. Jo, H. et al. (2019) [28] compared machine-learning algorithms for predicting the endpoint temperature of molten steel in a converter in steel-making processes. Omairi, A. et al. (2021) [29] proposed machine learning algorithms to detect product defects in cyber-physical systems in additive manufacturing. The authors argued that the inclusion of AI frameworks in automated tasks could improve the manufacturing process efficiently.

There has been a recent study to evaluate multi-level quality control based on various machine learning and blockchain-based solutions [30]. The authors found that XGBoost performs well by comparing the accuracy, precision, and recall of XGBoost and KNN algorithms.

Regarding injection molding, a study compared linear and kernel support vector machine (SVM) classifiers on datasets corresponding to product defects in an industrial environment around a plastic injection molding machine [31]. Another study used images of injection-molding products and applied deep learning algorithms [32]. The study found that long short-term memory (LSTM) fitted better than convolutional neural network (CNN) models in defect classification problems using image data. Although machine learning techniques based on image data are surging, not much research has been conducted on applying such methodologies to injection machine data. This research aims to apply several machine learning algorithms to the data gathered from injection machines.

3. Data and Methodology

3.1. Data

This study used a large injection machine dataset gathered from actual injection molding production at Hanguk Mold, a company in South Korea. Table 1 provides a description of all the variables that are available from the injection machine, and Figure 1 provides the process diagram of injection molding. There are over 50 available variables, and we selected the variables that are considered more important at manufacturing sites.

Table 2 provides the summary statistics of the data divided by the quality of the injection molding. As the defect ratio is relatively low, we oversampled the defect data using the synthetic minority oversampling technique (SMOTE). The summary statistics comparing the mean value of the injection machine variables for the original dataset are reported in Panel A, and those for the oversampled data are provided in Panel B of Table 2. The univariate comparison shows that, in general, there is a statistically significant difference in mold temperature-related measures, injection time, and plasticizing time for both the original and oversampled datasets. Given that interpreting results from univariate analyses raises several endogeneity issues, we further deployed machine learning techniques to capture how the variation in such features can explain the quality of the product.
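The interpolation idea behind SMOTE can be illustrated with a minimal sketch. This is not the implementation used in the study: the helper name `smote_like`, the neighbor count `k`, the seed, and the toy feature values are all assumptions made for illustration.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Hypothetical helper: create n_new synthetic points by interpolating
    between a minority sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Toy defect samples: (mold temperature in °C, injection time in s); made-up values.
defects = [[61.0, 1.20], [63.0, 1.35], [60.5, 1.10], [64.0, 1.40]]
new_points = smote_like(defects, n_new=6)
print(len(new_points))  # 6
```

Each synthetic point lies on the segment between two real defect samples, so the oversampled class stays inside the region already occupied by observed defects.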

3.2. Regression-Based Model

3.2.1. Logistic Regression

Logistic regression analysis is a representative method for linear-based classification algorithms. This algorithm is the basis of deep learning. A typical regression model estimates the linear regression equation below by determining the distribution characteristics of the features.

y = β_{0} + β_{1} x_{1} + β_{2} x_{2} + \dots + β_{n} x_{n}

(1)

The most important aspect of logistic regression is to model the probability of an event. Instead of y, the probability of belonging to category 1, p = P(Y = 1), is modeled, indicating a numeric value between 0 and 1 [33].

p = \frac{1}{1 + e^{- (β_{0} + β_{1} x_{1} + β_{2} x_{2} + \dots + β_{n} x_{n})}}

(2)

Then, the probabilities are categorized using appropriate thresholds. In this study, the threshold was set to 0.5 to classify good and bad products. However, logistic regression has basic assumptions that must be met, such as linearity in the logit for continuous variables, independence of errors, and absence of multicollinearity [34]. In our setting, passing the linear predictor through the sigmoid function yields a non-linear probability surface, while the decision boundary at a fixed threshold remains a hyperplane in the feature space. Once this hyperplane is fitted to the sensor data, its coefficients indicate which features are important.
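Equation (2) and the 0.5 threshold can be sketched in a few lines. The coefficient values and function names below are hypothetical, and treating a probability exactly equal to the threshold as class 1 is an assumption the text does not specify.

```python
import math

def predict_defect_probability(x, beta0, betas):
    """Equation (2): sigmoid of the linear predictor."""
    z = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-z))

def classify(x, beta0, betas, threshold=0.5):
    """Return 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_defect_probability(x, beta0, betas) >= threshold else 0

# Hypothetical coefficients for two standardized features.
beta0, betas = -1.0, [2.0, 0.5]
p = predict_defect_probability([1.0, 0.0], beta0, betas)  # linear predictor z = 1.0
print(round(p, 4), classify([1.0, 0.0], beta0, betas))    # 0.7311 1
```

With a threshold of 0.5, the predicted class flips exactly where the linear predictor changes sign, which is why the fitted coefficients can be read as the orientation of the decision boundary.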

3.2.2. Support Vector Machine

A hyperplane is a decision boundary that classifies the data in high dimensions. Compared to logistic regression, the SVM can classify high-dimensional data that cannot be classified by linear classification using hyperplanes. Providing a kernel function in higher-dimensional data allows for a non-linear classification of observations in the original data [35].

The support vector is the data closest to the decision boundary. The SVM uses a margin, the distance between these support vectors, to find the optimal decision boundary. It is very important to select the proper kernel function as it explains the feature space where the training data will be classified [36].
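The idea of classifying in a higher-dimensional space can be illustrated with an explicit feature map. This sketch does not train an SVM, and a kernel function would compute inner products in the mapped space implicitly rather than building the mapped points; the map, the weights, and the toy data are assumptions chosen for illustration.

```python
def feature_map(x):
    """Map a 1-D input to 2-D by adding a squared term."""
    return (x, x * x)

def linear_rule(z, w=(0.0, 1.0), b=-2.5):
    """A linear decision rule in the mapped space: sign of w·z + b."""
    return 1 if w[0] * z[0] + w[1] * z[1] + b > 0 else 0

# In one dimension, no single cut point separates the outer points (label 1)
# from the inner points (label 0).
points = [-2.0, -1.0, 1.0, 2.0]
labels = [1, 0, 0, 1]

predictions = [linear_rule(feature_map(x)) for x in points]
print(predictions)  # [1, 0, 0, 1]
```

A boundary that is linear in the mapped space corresponds to a non-linear boundary in the original feature, which is the effect a kernel function provides.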

3.3. Tree-Based Model

Another popularly used model in the manufacturing process is the decision tree-based model. Among several models that use decision trees, we deployed random forest, gradient boosting, XGBoost, LightGBM, and CatBoost algorithms following prior studies that developed models for computer numerical control (CNC) machines [26]. A tree model consists of decision trees, and its advantage is that we can extract feature importance to determine which features are important for quality prediction. Each of the five algorithms has its own method for extracting important features, and we compare the important features across all of them.

3.3.1. Random Forest

Random forest is an important machine learning algorithm for pattern recognition owing to its low cost. The main principle of the training strategy is bagging. This implies that each tree in the random forest is trained on a bootstrap sample, that is, a sample drawn with replacement from the dataset [37]. The remaining data are called out-of-bag and are used to evaluate the model performance [13]. Most boosting or bagging algorithms are based on decision trees [38]. The initial state of the node creates other nodes that contain features directed upward. Consequently, many decision trees are used, each classifying its own sampled set of data. With this method, individual decision trees can have low accuracy compared to a decision tree built using the total dataset. Hence, it is better to aggregate the results of all trees because the trees are trained on different data and complement each other [39].
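The bagging step and the out-of-bag set can be sketched as follows. The sketch assumes the conventional bootstrap, in which each tree's training sample is drawn with replacement; the sample size and seed are arbitrary choices.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]

rng = random.Random(42)
n = 1000
in_bag = bootstrap_indices(n, rng)

# Rows never drawn form the out-of-bag set for this tree.
out_of_bag = sorted(set(range(n)) - set(in_bag))
oob_fraction = len(out_of_bag) / n
print(round(oob_fraction, 2))
```

For large n the out-of-bag fraction approaches 1/e, roughly 0.37, so each tree leaves about a third of the rows available for evaluation without a separate hold-out split.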

3.3.2. Gradient Boosting

Boosting is another ensemble method that gradually reduces the training error by fitting each new model to the residuals of the previous models. Gradient boosting relies on the fact that, for squared-error loss, the residual equals the negative gradient of the loss [40]. Fitting a new model to the residuals therefore moves the prediction in the direction opposite to the gradient of the loss. Hence, this algorithm is called gradient boosting [41].
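The residual-fitting loop can be sketched with one-split regression stumps under squared-error loss. The helper names, the learning rate, the number of rounds, and the toy data are assumptions; production libraries add regularization and many optimizations that this sketch omits.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D inputs (squared error)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # the last value would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    base = sum(ys) / len(ys)        # initial constant prediction
    stumps, history = [], []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared error, the residual is the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    model = lambda x: base + learning_rate * sum(s(x) for s in stumps)
    return model, history

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]
model, mse_history = gradient_boost(xs, ys)
print(mse_history[-1] < mse_history[0])  # True
```

Each round shrinks the training error a little, which is also why boosting needs a stopping rule or regularization to avoid fitting noise.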

3.3.3. XGBoost

XGBoost is a helpful approach for optimizing the gradient boosting algorithm: it handles missing values, mitigates overfitting, and speeds up training using parallel processing. System optimization in XGBoost is achieved by implementing parallelization, tree pruning, and hardware optimization [42].

3.3.4. LightGBM

Although XGBoost, with its high parallelism, is faster than GBM, a method that can further reduce the training time is required for large datasets [43]. Compared with XGBoost, LightGBM (LGBM) shows better training time and memory efficiency, as it offers parallel computing capabilities for large amounts of data and more recently supports GPUs. LGBM was developed to inherit the advantages and compensate for the disadvantages of XGBoost. However, applications to small datasets of fewer than 10,000 observations are prone to overfitting. GBM is more robust to overfitting because it uses the level-wise method, but it requires time to balance the tree. LGBM uses the leaf-wise method [44]. Instead of balancing the tree, it continuously splits the leaf node with the maximum delta loss, expands the depth, and generates asymmetric rule trees [45]. As learning repeats, this method reduces the predictive error loss more than the balanced tree split method.

3.3.5. CatBoost

CatBoost can perform better than other GBM algorithms by introducing the ordering principle to solve the problem of prediction shift caused by target leakage in traditional approaches and in the pre-processing of categorical variables with high cardinality [46]. The first advantage is the reduction in learning time due to improvements in the categorical variable handling methods. Most GBMs use decision trees as base predictors, but with categorical variables, training takes a long time. Another advantage is the use of ordered boosting techniques to calculate leaf values to solve the prediction shift problem [47].

3.4. Autoencoder-Based Model

For predicting manufacturing quality, the amount of training data is important, and a deep framework outperforms other machine learning methods. This means that deep learning techniques can be applied to build accurate models in manufacturing fields. Similarly, deep feature learning is beneficial for exploring sophisticated relationships between multiple manufacturing features and quality [48].

An autoencoder consists of an encoder that maps the input to the hidden layer and a decoder that maps the encoded data back to the reconstruction [49]. First, it compresses the original input data to a vector of lower dimension and then decodes this vector to the original representation of the data [50]. A stacked autoencoder is an autoencoder with multiple hidden layers. As shown in Figure 2, the structure is symmetric with respect to the middle hidden layer, and the hidden layers have fewer nodes than the input and output layers. Autoencoder models learn a mapping from the high-dimensional input to a low-dimensional bottleneck by repeatedly compressing and reconstructing the input. In this process, an information bottleneck is created, and the model automatically learns to distinguish between important and non-critical features for restoring input samples. However, the autoencoder model incorporates normal data in developing the network. If the input data are suitable, the results are often significant. If data projected to higher dimensions using a kernel are well classified by a particular hyperplane, a conventional machine learning model may be more appropriate. Therefore, we hypothesized that the autoencoder model would perform better because the data are correlated and the classification results in the high-dimensional kernel space are not significant.
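The compress-and-reconstruct idea can be sketched with a tiny linear autoencoder that has a single bottleneck unit. This is not the stacked architecture of Figure 2. Scoring items by reconstruction error is a common way to use an autoencoder trained on normal data and is assumed here, since the text does not spell out the scoring rule; the toy data, initial weights, and learning rate are also assumptions.

```python
def reconstruct(x, w, v):
    """Encode a 2-D point to one hidden value, then decode back to 2-D."""
    h = w[0] * x[0] + w[1] * x[1]            # encoder: bottleneck of size 1
    return [v[0] * h, v[1] * h], h           # decoder

def reconstruction_error(x, w, v):
    x_hat, _ = reconstruct(x, w, v)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def train(data, epochs=50, lr=0.05):
    w, v = [0.5, 0.5], [0.5, 0.5]            # fixed start for reproducibility
    losses = []
    for _ in range(epochs):
        grad_w, grad_v = [0.0, 0.0], [0.0, 0.0]
        for x in data:                       # full-batch gradient of squared error
            x_hat, h = reconstruct(x, w, v)
            err = [x_hat[j] - x[j] for j in range(2)]
            for j in range(2):
                grad_v[j] += 2 * err[j] * h
            back = 2 * sum(err[j] * v[j] for j in range(2))
            for i in range(2):
                grad_w[i] += back * x[i]
        w = [w[i] - lr * grad_w[i] for i in range(2)]
        v = [v[j] - lr * grad_v[j] for j in range(2)]
        losses.append(sum(reconstruction_error(x, w, v) for x in data))
    return w, v, losses

# Made-up "normal" items: two standardized features that move together.
normal = [[t, t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, v, losses = train(normal)

max_normal_error = max(reconstruction_error(x, w, v) for x in normal)
anomaly_error = reconstruction_error([1.0, -1.0], w, v)  # breaks the correlation
print(anomaly_error > max_normal_error)  # True
```

Because the bottleneck can only represent the pattern seen in the normal items, an item that violates that pattern is reconstructed poorly, and a threshold on the error can flag it.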

3.5. Time Complexity and Model Evaluation

Model complexity usually depends on the size of the data. It is important to check the complexity because computing resources and time matter in practice. In other words, if the results of two models are similar, the model with lower complexity is more efficient in terms of resources and time and should be applied in practice; this is also closely related to the symbiotic relationship between humans and robots [51]. Logistic regression, a type of linear model, has the advantage of having no hyperparameters to tune, but this also means there is no way to control the complexity of the model. An autoencoder can be viewed as a combination of multiple logistic regression units, which makes its complexity difficult to calculate. For the models whose complexity can be computed, consider data consisting of n instances with m attributes. The time complexity of SVM is O(n²). The complexity of a decision tree, the basic building block of tree-based models, is O(mn²) [52]. The complexity of random forest is O(Mmn log n), where M is the number of trees. The various tree-based algorithms each employ their own methods to reduce complexity.

For the binary model evaluation, we used four elements to check the performance of the models: true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs). True elements are those that the model classifies correctly, and false elements are those that the model classifies incorrectly. Accuracy is the most intuitive metric because it does not require statistical interpretation.

Accuracy = \frac{TP + TN}{TP + FP + TN + FN}

(3)

Precision = \frac{TP}{TP + FP}

(4)

Recall = \frac{TP}{TP + FN}

(5)

F1\text{-score} = 2 \times \frac{Precision \times Recall}{Precision + Recall}

(6)

Generally, finding a defective item is more important than finding a good item. Therefore, we assign defective items to class "1" and good items to class "0" for proper model evaluation. The data are highly imbalanced, as the ratio of defective items is less than 1.5%. Accuracy alone is therefore insufficient: a model that classifies every item as 0 would still achieve an accuracy of about 98.5%. For this reason, precision, recall, and F1-score are commonly used to check model performance in imbalanced binary problems. Precision is the ratio of items predicted as defective that are actually defective [37]. Recall is the ratio of actual defective items that the model correctly predicts as defective. Generally, precision and recall are in a tradeoff, as they view model performance from different angles. The F1-score is the harmonic mean of precision and recall and balances the two, which is useful under class imbalance [53].
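Equations (3) to (6) and the effect of class imbalance can be checked with a short sketch. The item counts are invented to match a 1.5% defect ratio, and returning 0.0 when a ratio is undefined is a convention assumed here.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 with class 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0   # assumed convention: 0.0 if undefined
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# 1000 items, 15 of them defective (1.5%).
y_true = [1] * 15 + [0] * 985

# A model that labels every item as good.
always_good = binary_metrics(y_true, [0] * 1000)
print(always_good["accuracy"], always_good["recall"])  # 0.985 0.0

# A model that finds 12 of the 15 defects but raises 30 false alarms.
y_pred = [1] * 12 + [0] * 3 + [1] * 30 + [0] * 955
detector = binary_metrics(y_true, y_pred)
print(round(detector["precision"], 3), detector["recall"], round(detector["f1"], 3))
# 0.286 0.8 0.421
```

The first model has high accuracy while detecting no defects, whereas the second has lower accuracy but a far more useful recall, which is the reason for reporting all four metrics.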

4. Main Results

We employed several machine learning algorithms to observe and compare the performance of the models. Specifically, we employed logistic regression, support vector machine, random forest, gradient boosting, XGBoost, CatBoost, LightGBM, and autoencoder models. These models can be categorized as regression-based models, tree-based models, and autoencoder-based models.

The model results are presented in Table 3. Panel A reports the results for the regression-based models. Because logistic regression classifies the output of a linear regression through a sigmoid function, four statistical assumptions (linearity, homoscedasticity, independence, and normality) must be satisfied. However, in manufacturing data, variables are highly multicollinear, and some features are not invariant because of the characteristics of the process. Consequently, these statistical assumptions are difficult to satisfy, and the F1-score is remarkably poor. Unlike in the tree-based algorithms, recall was high in the regression models: they detect about 90% of defective items but also misclassify many good items as defective. This is a result of SMOTE, because, when we train the model on oversampled data, SMOTE sets a 50:50 ratio of defective to good items. We therefore observe a recall of almost 90% but a precision of less than 10% (Table 3, Panel A). The linear SVM behaves almost the same way, because the poorly satisfied statistical assumptions make it challenging to find a hyperplane that separates good and defective items. Further, a good linear hyperplane is difficult to find when good and defective items are mixed in a high-dimensional space.
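The oversampling step can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic defective sample is an interpolation between a real defective sample and one of its nearest defective neighbours. The function name `smote_sketch`, its parameters, and the toy values (loosely styled as mold and hopper temperatures) are hypothetical, and this simplified version is not the implementation used in this study.

```python
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the chosen sample (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic

# Hypothetical defective samples: [mold temperature, hopper temperature]
minority = [[23.5, 67.7], [24.1, 68.0], [22.9, 67.2], [23.8, 67.9]]
synthetic = smote_sketch(minority, n_new=10, k=2, seed=1)
print(len(synthetic), "synthetic defective samples generated")
```

Because every synthetic point lies on a segment between two real defective samples, oversampling enlarges the region the classifier treats as defective, which is consistent with the high recall and low precision seen in Panel A.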

Panel B of Table 3 provides the results of the tree-based models. We found that random forest, a bagging method, outperforms the boosting-type methods. For non-parametric data, resampling the data (bagging) works better than updating the model based on residuals (boosting). It seems that, for manufacturing data, non-parametric methods are better than parametric methods. However, the human error rate in image classification problems is set at 5%, and measured against that benchmark the tree-based results are not yet applicable to actual industry.

The autoencoder model results are presented in Panel C of Table 3. The stacked autoencoder classified most of the products accurately. In the case of 5617 quality data, 70% were used for training and 30% for testing (1605). The autoencoder differs from the other machine learning algorithms in that the network is trained only on good items. When classifying 1605 good items and 125 defective items, the F1-score was 0.9727. The autoencoder performs better than the others because it is non-parametric and is not significantly affected by the distribution between features. It detects every defective item; therefore, the recall is 1 and the precision is 0.9469, which is comparable to human error (i.e., 5%). Another advantage of this method is that only good items are used for training; thus, all defective items can be used for model evaluation. Because many manufacturing datasets are imbalanced, a sampling method is usually necessary to create a model. The autoencoder, however, can skip this process, and thus it is more accurate and easier to use for classification.
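As a consistency check on Panel C of Table 3, the reported scores can be reproduced from a confusion matrix. The counts below are an assumption: TP = 125 and FN = 0 follow from the reported recall of 1 on 125 defective items, while FP = 7 (and hence TN = 1598 of the 1605 good test items) is back-calculated from the reported precision and is not stated in the text. Under these assumed counts, Equations (3) to (6) give values that agree with the table when truncated to four decimals.

```latex
\begin{aligned}
Precision &= \frac{TP}{TP + FP} = \frac{125}{125 + 7} \approx 0.94697 \\
Recall &= \frac{TP}{TP + FN} = \frac{125}{125 + 0} = 1 \\
Accuracy &= \frac{TP + TN}{TP + FP + TN + FN} = \frac{125 + 1598}{1730} \approx 0.99595 \\
F1\text{-}score &= 2 \times \frac{0.94697 \times 1}{0.94697 + 1} \approx 0.97276
\end{aligned}
```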

5. Discussion

This paper finds that autoencoder-based models outperform tree-based models. In the tree-based models, there are two main ways of developing models: bagging and boosting. Bagging focuses more on how to organize the data well before building the model, and boosting focuses more on the sensor values, developing the model through updates based on the residuals. Referring to Table 3, gradient boosting has the highest recall among the tree-based models, 0.7638. This means the model detects 76% of defective products. However, only about 56% of the products the model flagged as defective were actually defective. Identifying which products are defective is important because of cost. Therefore, we applied a stacked autoencoder. It is a method for anomaly detection based on the differences between input and output data, where the network learns to reduce the dimension of the original data and then restore it. The advantage of this approach is that the model does not need any defective items for training. This makes the model sustainable for low-cost injection molding, because labeling is expensive when collecting the data needed to build a model. Since the network learns only from good items, labeling costs are reduced, and the results are very good, as shown in Panel C of Table 3. In other words, in injection molding, good products follow a stable pattern, and defective products deviate from it, so the two can be classified well.
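The decision rule behind this kind of anomaly detection can be sketched as follows: compute the reconstruction error of each item and flag items whose error exceeds a threshold fitted on good items only. Everything here is a hedged illustration. The mean-plus-three-standard-deviations threshold is an assumed rule, the feature values are hypothetical, and `mean_reconstructor` is only a stand-in that returns the average good item; in the study this role is played by a trained stacked autoencoder.

```python
import statistics

def reconstruction_error(x, x_hat):
    """Mean squared difference between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def fit_threshold(good_errors, n_sigma=3.0):
    """Assumed rule: mean error on good items plus n_sigma standard deviations."""
    return statistics.mean(good_errors) + n_sigma * statistics.pstdev(good_errors)

def is_defective(x, reconstruct, threshold):
    """Flag an item whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, reconstruct(x)) > threshold

# Hypothetical good items: [mold temperature, hopper temperature]
good_items = [[20.0, 66.0], [21.0, 66.5], [20.5, 66.2], [21.2, 66.4], [20.3, 66.1]]

# Stand-in for a trained autoencoder: always "reconstructs" the average good item.
feature_means = [statistics.mean(col) for col in zip(*good_items)]

def mean_reconstructor(x):
    return feature_means

good_errors = [reconstruction_error(x, mean_reconstructor(x)) for x in good_items]
threshold = fit_threshold(good_errors)

candidate = [24.0, 68.0]  # hypothetical item far from the good pattern
flagged = is_defective(candidate, mean_reconstructor, threshold)
print("candidate flagged as defective:", flagged)
```

Only good items are needed to fit the threshold, which mirrors the labeling-cost argument made above.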

Furthermore, given the significant results of machine learning methods in predicting the quality of injection molding, the variables that drive these results are also important. That is, among the dozens of injection machine variables, which features are most responsible for quality problems in injection molding businesses? To answer this, we employ feature importance tests for each model used in the analysis.

Figure 3 shows the combined feature importance graph. Regardless of the models used, we found that molding temperature, hopper temperature, injection time, and cycle time are important variables commonly selected by machine learning techniques. These findings contribute to manufacturing sites. With over 50 control variables on injection machines, workers find it difficult to efficiently control each variable. Using important features selected by machine learning algorithms may reduce the worker’s time controlling machines and consequently increase the production level.
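The text does not state here how importance is computed for each model. One common model-agnostic option, shown purely as an illustration and not necessarily the authors' procedure, is permutation importance: shuffle one feature at a time and measure how much accuracy drops. The toy model, feature values, and function names below are hypothetical.

```python
import random

def accuracy(model, X, y):
    """Share of rows for which the model's prediction equals the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical rows: [mold temperature, max screw RPM]; label 1 = defective
X = [[20.1, 30.7], [20.8, 30.8], [21.0, 30.6], [20.5, 30.9],
     [23.4, 30.7], [24.0, 30.8], [23.7, 30.6], [24.6, 30.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

def toy_model(row):
    """Toy classifier that looks only at mold temperature."""
    return 1 if row[0] > 22.0 else 0

importances = permutation_importance(toy_model, X, y)
print("importance per feature:", importances)
```

In this toy setting the feature the model ignores receives an importance of exactly zero, while shuffling the temperature column degrades accuracy.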

This study has two limitations. The first concerns the data. Threats to validity are an important category to be discussed in machine learning studies [54,55]. Among the several categories of threats to validity, this paper is mostly concerned with external validity; that is, the ability to generalize the findings beyond the experimental setting may be limited. As the results come from a plastic injection molding business, our findings may differ in other molding businesses. Because the data come from a manufacturing company known for plastic injection, they might be biased; that is, they may contain characteristics of plastic injection molding that do not apply to other types of molding. The second limitation is the insufficient investigation of what caused the defects. In manufacturing, model results such as accuracy, recall, precision, and F1-score are important, but the explanation of the results is often more important. Businesses need to find the causes of an outcome, and the current study may not have done so sufficiently. Feature importance tests identify the variables that matter for the model’s classification; whether those variables are actually important in the process is a separate question. Possible ways to address this problem include combining the model with an explainable model, changing the structure of the deep learning model to understand which activation functions’ responses affect the results, or using an explainable model directly. In the future, further examples using data from different injection machines will be needed, as will a model structure that focuses on the cause rather than on the outcome through explainable models.

6. Conclusions

Quality issues have long been a critical concern in injection molding businesses. Such technical issues have become more important for firms’ sustainable growth, especially in the Industry 4.0 era. We believe that adapting to the new environment is the key innovation that will keep manufacturing industries in a leading role in the market. With many artificial intelligence models introduced every day, manufacturing industries should also try to be more innovative by applying such modern techniques to their current manufacturing processes. Furthermore, quality efficiency is an important concern for the sustainable development of manufacturing businesses, and it is closely related to issues of energy efficiency [56,57]. If enterprises want to reduce costs and find or retain clients, they should offer products of the highest quality at reasonable prices.

Hence, injection molding firms attempt to improve their production efficiency and enhance quality prediction by monitoring the key variables that drive the quality of injection molding. As the manufacturing environment becomes more dynamic with an increasing number of products, failing to respond to the environment with agility causes customer dissatisfaction and therefore harms companies’ competitiveness in the market [4]. Intelligent solutions that can solve such complex problems are therefore required, and many prior studies have examined the importance of Industry 4.0 for enterprises in a changeable and innovative environment [58,59,60].

Injection molding manufacturing consists of complex production systems because many parts are combined, and the specifications of each mold are different. Moreover, mold products go through different processes, and all these factors increase the complexity and dynamics of the manufacturing environment. From the perspective of the data gathered during the process, this also implies non-linear and complex relationships among variables. Therefore, statistical methods based on linearity assumptions may not be effective. Using quality prediction as a testing ground, this study performed a comparative analysis of various machine learning methodologies. At a high level, we demonstrated that machine learning methods can help improve the understanding of quality problems in the injection molding industry. Using the large real production dataset gathered from the injection machines, we found that machine learning models are generally useful for quality prediction. Autoencoder and random forest are the best performing methods. Specifically, we showed that the autoencoder model outperforms the tree-based machine learning algorithms in terms of accuracy and F1-score.

We also traced the advantage of these machine learning algorithms to their ability to accommodate non-linear interactions that are often missed by classical methods. The injection molding process is a combination of numerous variables, such as temperature, pressure, and velocity, and the relationship between these variables is not linear. Thus, methods that have comparative advantages in handling non-linear relationships are necessary.

In addition to the prediction results of several machine learning methods, we tested which factors are key variables that influence the quality of injection molding products. We found that molding temperature, hopper temperature, injection time, and cycle time are important variables commonly selected by machine learning techniques.

Author Contributions

Conceptualization, H.J. and J.-Y.P.; methodology, J.J. and D.C.; software, J.J. and D.C.; validation, H.J. and J.-Y.P.; formal analysis, H.J.; investigation, H.J., J.J., D.C., and J.-Y.P.; resources, H.J.; data curation, H.J.; writing (original draft preparation), H.J., J.J., and D.C.; writing (review and editing), H.J.; visualization, J.J. and D.C.; supervision, H.J. and J.-Y.P.; project administration, H.J.; funding acquisition, J.-Y.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Please contact the corresponding author regarding the data. The data is available upon request.

Acknowledgments

We thank Hanguk Mold for providing the manufacturing data. We also thank Inter X for preprocessing the data and providing us the rich data for analysis.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Carvalho, T.P.; Soares, F.A.; Vita, R.; Francisco, R.D.; Basto, J.P.; Alcalá, S.G. A systematic literature review of machine learning methods applied to predictive maintenance. Comput. Ind. Eng. 2019, 137, 106024.
2. Borowski, P.F. Innovative Processes in Managing an Enterprise from the Energy and Food Sector in the Era of Industry 4.0. Processes 2021, 9, 381.
3. Lan, T.; Jermsittiparsert, K.; Alrashood, S.T.; Rezaei, M.; Al-Ghussain, L.; Mohamed, M.A. An Advanced Machine Learning Based Energy Management of Renewable Microgrids Considering Hybrid Electric Vehicles’ Charging Demand. Energies 2021, 14, 569.
4. Lee, S.; Cho, Y.; Lee, Y.H. Injection Mold Production Sustainable Scheduling Using Deep Reinforcement Learning. Sustainability 2020, 12, 8718.
5. Oluyisola, O.E.; Sgarbossa, F.; Strandhagen, J.O. Smart Production Planning and Control: Concept, Use-Cases and Sustainability Implications. Sustainability 2020, 12, 3791.
6. Çınar, Z.M.; Abdussalam Nuhu, A.; Zeeshan, Q.; Korhan, O.; Asmael, M.; Safaei, B. Machine Learning in Predictive Maintenance towards Sustainable Smart Manufacturing in Industry 4.0. Sustainability 2020, 12, 8211.
7. Low, M.L.H.; Lee, K.S. Mould data management in plastic injection mould industries. Int. J. Prod. Res. 2008, 46, 6269–6304.
8. Alam, T.M.; Shaukat, K.; Hameed, I.A.; Luo, S.; Sarwar, M.U.; Shabbir, S.; Li, J.; Khushi, M. An Investigation of Credit Card Default Prediction in the Imbalanced Datasets. IEEE Access 2020, 8, 201173–201198.
9. Li, C.; Li, C.; Mok, A. Automatic layout design of plastic injection mould cooling system. Comput. Aided Des. 2005, 37, 645–662.
10. Liang, J.-Z. An optimal design of cooling system for injection mold. Polym. Plast. Technol. Eng. 2002, 41, 261–271.
11. Li, Z.; Wang, X.; Gu, J.; Ruan, S.; Shen, C.; Lyu, Y.; Zhao, Y. Topology optimization for the design of conformal cooling system in thin-wall injection molding based on BEM. Int. J. Adv. Manuf. Technol. 2018, 94, 1041–1059.
12. Lee, K.J.; Yapp, E.K.Y.; Li, X. Unsupervised Probability Matching for Quality Estimation with Partial Information in a Multiple-Instances, Single-Output Scenario. In Proceedings of the 2020 15th IEEE Conference on Industrial Electronics and Applications (ICIEA), Kristiansand, Norway, 9–13 November 2020; pp. 1432–1437.
13. Madan, J.; Mani, M.; Lyons, K.W. Characterizing energy consumption of the injection molding process. In ASME 2013 International Manufacturing Science and Engineering Conference Collocated with the 41st North American Manufacturing Research Conference; American Society of Mechanical Engineers Digital Collection: Madison, WI, USA, 2013.
14. Low, M.; Lee, K. A parametric-controlled cavity layout design system for a plastic injection mould. Int. J. Adv. Manuf. Technol. 2003, 21, 807–819.
15. Hu, W.; Masood, S. An intelligent cavity layout design system for injection moulds. Int. J. CAD CAM 2002, 2, 69–75.
16. Fu, M.; Fuh, J.; Nee, A. Core and cavity generation method in injection mould design. Int. J. Prod. Res. 2001, 39, 121–138.
17. Lockner, Y.; Hopmann, C. Induced network-based transfer learning in injection molding for process modelling and optimization with artificial neural networks. Int. J. Adv. Manuf. Technol. 2021, 112, 3501–3513.
18. Ke, K.-C.; Huang, M.-S. Quality Classification of Injection-Molded Components by Using Quality Indices, Grading, and Machine Learning. Polymers 2021, 13, 353.
19. Binding, A.; Dykeman, N.; Pang, S. Machine Learning Predictive Maintenance on Data in the Wild. In Proceedings of the 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), Limerick, Ireland, 15–18 April 2019; pp. 507–512.
20. Qi, X.; Chen, G.; Li, Y.; Cheng, X.; Li, C. Applying Neural-Network-Based Machine Learning to Additive Manufacturing: Current Applications, Challenges, and Future Perspectives. Engineering 2019, 5, 721–729.
21. Yang, Y.; He, M.; Li, L. Power consumption estimation for mask image projection stereolithography additive manufacturing using machine learning based approach. J. Clean. Prod. 2020, 251, 119710.
22. Jun, J.-H.; Chang, T.-W.; Jun, S. Quality Prediction and Yield Improvement in Process Manufacturing Based on Data Analytics. Processes 2020, 8, 1068.
23. Guh, R.S.; Shiue, Y.R. An effective application of decision tree learning for on-line detection of mean shifts in multivariate control charts. Comput. Ind. Eng. 2008, 55, 475–493.
24. Strano, M.; Colosimo, B.M. Logistic regression analysis for experimental determination of forming limit diagrams. Int. J. Mach. Tools Manuf. 2006, 46, 673–682.
25. Pal, M. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 2005, 26, 217–222.
26. Zhang, C.; Jiang, P. Sustainability Evaluation of Process Planning for Single CNC Machine Tool under the Consideration of Energy-Efficient Control Strategies Using Random Forests. Sustainability 2019, 11, 3060.
27. Alhamad, I.M.; Ahmed, W.K.; Ali, H.Z. Boosting teaching experience in mechanical engineering courses using additive manufacturing technologies. In Proceedings of the 2019 Advances in Science and Engineering Technology International Conferences (ASET), Dubai, UAE, 26 March–11 April 2019; IEEE: New York, NY, USA; pp. 1–6.
28. Jo, H.; Hwang, H.J.; Phan, D.; Lee, Y.; Jang, H. Endpoint temperature prediction model for LD converters using machine-learning techniques. In Proceedings of the 2019 IEEE 6th International Conference on Industrial Engineering and Applications (ICIEA), Tokyo, Japan, 12–15 April 2019; pp. 22–26.
29. Omairi, A.; Ismail, Z.H. Towards Machine Learning for Error Compensation in Additive Manufacturing. Appl. Sci. 2021, 11, 2375.
30. Shahbazi, Z.; Byun, Y.-C. Integration of Blockchain, IoT and Machine Learning for Multistage Quality Control and Enhancing Security in Smart Manufacturing. Sensors 2021, 21, 1467.
31. Ribeiro, B. Support vector machines for quality monitoring in a plastic injection molding process. IEEE Trans. Syst. Man Cybern. 2005, 35, 401–410.
32. Nagorny, P.; Pillet, M.; Pairel, E.; Le Goff, R.; Loureaux, J.; Wali, M.; Kiener, P. Quality prediction in injection molding. In Proceedings of the 2017 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA), Annecy, France, 26–28 June 2017; pp. 141–146.
33. Bender, R.; Grouven, U. Ordinal logistic regression in medical research. J. R. Coll. Physicians Lond. 1997, 31, 546–551.
34. Stoltzfus, J.C. Logistic regression: A brief primer. Acad. Emerg. Med. 2011, 18, 1099–1104.
35. Lieber, D.; Stolpe, M.; Konrad, B.; Deuse, J.; Morik, K. Quality prediction in interlinked manufacturing processes based on supervised & unsupervised machine learning. Procedia CIRP 2013, 7, 193–198.
36. Orrù, P.F.; Zoccheddu, A.; Sassu, L.; Mattia, C.; Cozza, R.; Arena, S. Machine Learning Approach Using MLP and SVM Algorithms for the Fault Prediction of a Centrifugal Pump in the Oil and Gas Industry. Sustainability 2020, 12, 4776.
37. Taghi, M.K.; Moiz, G.; Jason, V.H. An empirical study of learning from imbalanced data using Random Forest. In Proceedings of the 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007), Patras, Greece, 29–31 October 2007.
38. Breiman, L. Bagging predictors. Mach. Learn. 1996, 24, 123–140.
39. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
40. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
41. Apté, C.; Weiss, S. Data mining with decision trees and decision rules. Future Gener. Comput. Syst. 1997, 13, 197–210.
42. Bhattacharya, S.; Krishnan S, S.R.; Maddikunta, P.K.R.; Kaluri, R.; Singh, S.; Gadekallu, T.R.; Alazab, M.; Tariq, U. A Novel PCA-Firefly Based XGBoost Classification Model for Intrusion Detection in Networks Using GPU. Electronics 2020, 9, 219.
43. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
44. Fan, J. Real-Time GDP Nowcasting in New Zealand: An Ensemble Machine Learning Approach: A Thesis Presented for the Degree of Master of Philosophy. Ph.D. Thesis, School of Natural and Computational Sciences Massey University, Auckland, New Zealand, 2019.
45. Rokad, B.; Karumudi, T.; Acharya, O.; Jagtap, A. Survival of the Fittest in PlayerUnknown BattleGround. arXiv 2019, arXiv:1905.06052.
46. Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. In Advances in Neural Information Processing Systems; Neural Information Processing Systems (NIPS), Inc.: San Diego, CA, USA, 2018; pp. 6638–6648.
47. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient boosting with categorical features support. arXiv 2018, arXiv:1810.11363.
48. Bai, Y.; Sun, Z.; Deng, J.; Li, L.; Long, J.; Li, C. Manufacturing Quality Prediction Using Intelligent Learning Approaches: A Comparative Study. Sustainability 2018, 10, 85.
49. Katuwal, R.; Suganthan, P.N. Stacked autoencoder based deep random vector functional link neural network for classification. Appl. Soft Comput. 2019, 85, 105854.
50. Roy, S.S.; Hossain, S.I.; Akhand, M.A.H.; Murase, K. A robust system for noisy image classification combining denoising autoencoder and convolutional neural network. Int. J. Adv. Comput. Sci. Appl. 2018, 9, 224–235.
51. Shaukat, K.; Luo, S.; Chen, S.; Liu, D. Cyber Threat Detection Using Machine Learning Techniques: A Performance Evaluation Perspective. In Proceedings of the International Conference on Cyber Warfare and Security, Islamabad, Pakistan, 29 September–1 October 2020; pp. 1–6.
52. Shaukat, K.; Iqbal, F.; Alam, T.M.; Aujla, G.K.; Devnath, L.; Khan, A.G.; Iqbal, R.; Shahzadi, I.; Rubab, A. The Impact of Artificial intelligence and Robotics on the Future Employment Opportunities. Trends Comput. Sci. Inf. Technol. 2020, 5, 50–54.
53. Hossin, M.; Sulaiman, M.N. A Review on Evaluation Metrics for Data Classification Evaluations. Int. J. Data Min. Knowl. Manag. Process (IJDKP) 2015, 5, 1–11.
54. Shaukat, K.; Luo, S.; Varadharajan, V.; Hameed, I.A.; Xu, M. A Survey on Machine Learning Techniques for Cyber Security in the Last Decade. IEEE Access 2020, 8, 222310–222354.
55. Shaukat, K.; Luo, S.; Varadharajan, V.; Hameed, I.A.; Chen, S.; Liu, D.; Li, J. Performance Comparison and Current Challenges of Using Machine Learning Techniques in Cybersecurity. Energies 2020, 13, 2509.
56. Wojdalski, J.; Krajnik, M.; Borowski, P.; Dróżdż, B.; Kupczyk, A. Energy and water efficiency in the gelatine production plant. AIMS Geosci. 2020, 6, 491–503.
57. Griffith, R. Product Market Competition, Efficiency and Agency Cost: An Empirical Analysis; Institute for Fiscal Studies: London, UK, 2001.
58. Borowski, P.F. Adaptation strategy on regulated markets of power companies in Poland. Energy Environ. 2019, 30, 3–26.
59. Fogarty, L.; Creanza, N. The niche construction of cultural complexity: Interactions between innovations, population size and the environment. Philos. Trans. R. Soc. A 2017, 372, 20160428.
60. Borowski, P.F. New Technologies and Innovative Solutions in the Development Strategies of Energy Enterprises. HighTech Innov. J. 2020, 1, 39–58.

Figure 1. Injection molding process.

Figure 2. Autoencoder explanation.

Figure 3. Feature importance of models.

Table 1. Variable description.

Variable Name (Unit)	Description
Injection_Time (s)	The time it takes the screw to move from the injection start position to the transfer position.
Filling_Time (s)	Filling time is an indication of how fast the plastic is injected into the mold.
Plasticizing_Time (s)	The time spent plasticizing the plastic.
Cycle_Time (s)	The amount of time it takes to start and end injection molding.
Clamp_Close_Time (s)	The time the mold is closed.
Cushion_Position (mm)	The position of cushion after the mold filling and pack stages of the injection process.
Switch_Over_Position (mm)	The quality of the molded part is greatly influenced by the conditions under which it is processed.
Plasticizing_Position (mm)	Plasticizing position; during the cooling time, the molding machine begins plasticizing material in the barrel to prepare for the next cycle.
Clamp_Open_Position (mm)	Clamp position when clamping force is applied to a mold.
Max_Injection_Speed (mm/s)	Maximum speed at which the screw pushes molten plastic resin into the mold cavity.
Max_Screw_RPM (RPM)	Maximum screw rotation speed, i.e., the speed at which the screw rotates to mix the pellets.
Average_Screw_RPM (RPM)	Average screw rotation speed, i.e., the speed at which the screw rotates to mix the pellets.
Max_Injection_Pressure (MPa)	Maximum pressure with which the screw pushes molten plastic resin into the mold cavity.
Max_Switch_Over_Pressure (MPa)	Maximum pressure applied at the switch-over position.
Max_Back_Pressure (MPa)	Maximum back pressure applied.
Average_Back_Pressure (MPa)	Average back pressure applied.
Barrel_Temperature (°C)	The barrel temperature, one of the temperatures that must be controlled during the plastic injection molding process.
Mold_Temperature (°C)	Temperature of the actual mold cavity after it has stabilized.

Table 2. Summary statistics.

Panel A. Original data
	All Observations (n = 8149)		Good (n = 8024)		Defect (n = 125)		Difference in Means
Variables	Mean	Std.	Mean	Std.	Mean	Std.	t-Test
Injection_Time	9.5780	0.1384	9.5755	0.1156	9.7388	0.6070	3.0076 **
Filling_Time	4.4558	0.1177	4.4532	0.0891	4.6231	0.6067	3.1313 **
Plasticizing_Time	16.6920	0.2930	16.6906	0.2945	16.7842	0.1446	7.0205 ***
Cycle_Time	59.5273	0.2572	59.5254	0.2434	59.6491	0.7068	1.9555
Clamp_Close_Time	7.1163	0.0490	7.1162	0.0493	7.1232	0.0113	1.5765
Cushion_Position	653.4440	0.0770	653.4442	0.0774	653.4279	0.0396	2.3514 *
Plasticizing_Position	68.1365	0.5236	68.1351	0.5257	68.2250	0.3560	1.9040
Clamp_Open_Position	646.7125	27.1532	646.6926	27.3635	647.9900	0.0000	0.5301
Max_Injection_Speed	55.2213	1.2147	55.2365	1.1202	54.2440	3.8437	2.8850 **
Max_Screw_RPM	30.7456	0.1552	30.7458	0.1555	30.7328	0.1337	1.0761
Average_Screw_RPM	88.6168	110.0665	88.6565	110.0928	86.0656	108.7740	0.2611
Max_Injection_Pressure	142.1412	1.3046	142.1388	1.3078	142.2960	1.0683	1.6260
Max_Switch_Over_Pressure	136.1801	1.1938	136.1610	1.1610	137.4056	2.2146	6.2699 ***
Max_Back_Pressure	38.0278	1.3706	38.0073	1.1490	39.3416	6.0211	2.4769 *
Average_Back_Pressure	59.6506	2.3372	59.6349	2.2758	60.6600	4.7719	2.3975 *
Barrel_Temperature_1	276.0697	1.5231	276.0680	1.5339	276.1816	0.4332	0.8278
Barrel_Temperature_2	275.2125	1.2489	275.2129	1.2579	275.1896	0.3217	0.2070
Barrel_Temperature_3	274.9476	1.2199	274.9460	1.2287	275.0496	0.3222	0.9418
Barrel_Temperature_4	270.3618	1.4428	270.3602	1.4507	270.4672	0.7790	0.8228
Barrel_Temperature_5	254.9615	0.7733	254.9617	0.7772	254.9464	0.4665	0.2201
Barrel_Temperature_6	229.9891	0.3167	229.9896	0.3179	229.9576	0.2283	1.1208
Hopper_Temperature	66.2994	2.2812	66.2770	2.2858	67.7360	1.3322	11.9731 ***
Mold_Temperature_3	20.8714	2.7020	20.8306	2.6952	23.4888	1.6493	17.6564 ***
Mold_Temperature_4	22.0258	2.9231	21.9848	2.9179	24.6576	1.8725	15.6655 ***
Panel B. Oversampled data
	All Observations (n = 13,679)		Good (n = 8024)		Defect (n = 5655)		Difference in Means
Variables	Mean	Std.	Mean	Std.	Mean	Std.	t-Test
Injection_Time	9.6083	0.2672	9.5755	0.1156	9.6549	0.3874	14.9568 ***
Filling_Time	4.4888	0.2616	4.4532	0.0891	4.5394	0.3872	16.4397 ***
Plasticizing_Time	16.7333	0.2457	16.6906	0.2945	16.7939	0.1291	27.8563 ***
Cycle_Time	59.5481	0.3795	59.5254	0.2434	59.5804	0.5125	7.4944 ***
Clamp_Close_Time	7.1188	0.0382	7.1162	0.0493	7.1225	0.0076	11.1030 ***
Cushion_Position	653.4374	0.0634	653.4442	0.0774	653.4278	0.0325	15.0889 ***
Plasticizing_Position	68.1746	0.4648	68.1351	0.5257	68.2307	0.3539	12.7079 ***
Clamp_Open_Position	647.2289	20.9667	646.6926	27.3635	647.9900	0.0000	4.2471 ***
Max_Injection_Speed	55.0123	2.0042	55.2365	1.1202	54.6943	2.7864	13.8651 ***
Max_Screw_RPM	30.7365	0.1387	30.7458	0.1555	30.7233	0.1092	9.9524 ***
Average_Screw_RPM	89.0191	110.3241	88.6565	110.0928	89.5335	110.6593	0.4578
Max_Injection_Pressure	142.1346	1.0988	142.1388	1.3078	142.1286	0.7027	0.5910
Max_Switch_Over_Pressure	136.5476	1.4159	136.1610	1.1610	137.0961	1.5571	38.2802 ***
Max_Back_Pressure	38.3142	2.7955	38.0073	1.1490	38.7496	4.0877	13.2907 ***
Average_Back_Pressure	59.8672	2.6853	59.6349	2.2758	60.1969	3.1480	11.4772 ***
Barrel_Temperature_1	276.1513	1.2000	276.0680	1.5339	276.2694	0.3477	11.3594 ***
Barrel_Temperature_2	275.2030	0.9780	275.2129	1.2579	275.1891	0.2614	1.6483
Barrel_Temperature_3	275.0023	0.9620	274.9460	1.2287	275.0820	0.2926	9.5385 ***
Barrel_Temperature_4	270.4701	1.1922	270.3602	1.4507	270.6261	0.6409	14.5323 ***
Barrel_Temperature_5	254.9644	0.6506	254.9617	0.7772	254.9682	0.4084	0.6298
Barrel_Temperature_6	229.9745	0.2763	229.9896	0.3179	229.9530	0.2011	8.2435 ***
Hopper_Temperature	66.9286	2.0686	66.2770	2.2858	67.8531	1.2167	52.1636 ***
Mold_Temperature_3	22.0273	2.6498	20.8306	2.6952	23.7254	1.3275	82.9811 ***
Mold_Temperature_4	23.1947	2.8463	21.9848	2.9179	24.9114	1.5786	75.5214 **

Note: *, **, and *** refer to the statistical significance where p-value < 0.1, < 0.05, and < 0.01, respectively.

Table 3. Model results.

Panel A. Regression-based models
	Accuracy	Precision	Recall	F1-Score
Logistic Regression	0.8449	0.0833	0.8947	0.1521
Support Vector Machine	0.8642	0.0961	0.9210	0.1741
Panel B. Tree-based models
	Accuracy	Precision	Recall	F1-Score
Random Forest	0.9918	0.7647	0.6841	0.7222
Gradient Boosting	0.9862	0.5576	0.7638	0.6444
XGBoost	0.989366	0.6761	0.6052	0.6388
CatBoost	0.9905	0.6923	0.7105	0.7012
LightGBM	0.9914	0.7575	0.6578	0.7042
Panel C. Autoencoder model
	Accuracy	Precision	Recall	F1-Score
Autoencoder	0.9959	0.9469	1.0000	0.9727

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
