This article addresses the crucial role of solder paste inspection (SPI) in the printed circuit board manufacturing process and highlights the challenges posed by imbalanced data and the limitations of traditional automated optical inspection (AOI) systems. The study introduces a robust method based on long short-term memory (LSTM) to identify defects in solder paste deposition, a critical stage in which 50% to 70% of circuit board defects occur. By using principal component analysis (PCA) to reduce data dimensionality, the research accelerates model training and improves the performance of the LSTM deep learning network. To tackle the class imbalance problem, the study uses the borderline synthetic minority over-sampling technique (SMOTE) to generate additional minority-class instances and edited nearest neighbor (ENN) to remove mislabeled samples, yielding a more balanced and reliable dataset. The proposed LSTM-based identification model shows superior discrimination capability, achieves high true negative and true positive rates, and outperforms traditional machine learning and deep learning approaches. A case study with 401,393 SPI samples from a Taiwanese electronics company confirms the effectiveness of the method and shows its potential to improve production yields and lower production costs. The article concludes by emphasizing the importance of smart manufacturing and the implementation of AI algorithms for increasing competitiveness in the electronics industry.
AI-generated
This summary of the content was generated with the help of AI.
Abstract
Nowadays, the demand for high-quality electronic products has increased substantially, compelling printed circuit board (PCB) manufacturers to improve the quality of surface mount technology (SMT) processes. Due to the high sensitivity settings of solder paste inspection (SPI) machines, the number of samples classified as defective has significantly increased. As a result, quality inspectors must perform additional manual re-inspections, which increases labor costs and raises the risk of misclassification. To address this issue, this study integrates principal component analysis (PCA), the borderline synthetic minority over-sampling technique (SMOTE), the edited nearest neighbor (ENN) method, and the long short-term memory (LSTM) deep learning model to develop a robust methodology for identifying defective samples in the solder paste printing (SPP) stage. To facilitate more efficient data analysis, PCA is first employed to analyze the information content of the feature variables for SPI data, thereby extracting the most informative features. Because defective SPP samples are relatively scarce, the borderline SMOTE method is applied to generate additional minority-class observations. During the training process of the deep learning model, mislabeled data may adversely impact the classification accuracy of the proposed model. To mitigate this issue, the ENN method is utilized to remove potentially mislabeled samples. Once high-quality samples are obtained, an LSTM deep learning model is employed to establish a robust defect identification model for SPP samples to boost the PCB manufacturing yield. Finally, this study uses the SPI data to validate the proposed model, achieving an outstanding classification accuracy of 99.9%.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1 Introduction
In 2020, the global printed circuit board (PCB) industry benefited from the increase in remote working and learning. Additionally, the rapid development of fifth-generation mobile communication networks and the Internet of Things (IoT) has further stimulated the demand for consumer electronic products, including computers, communication devices, and gaming consoles. Surface-mount technology (SMT) is a crucial part of the PCB manufacturing process. PCB assembly typically comprises five stages, namely solder paste printing (SPP), solder paste inspection (SPI), chip placement, reflow, and automated optical inspection (AOI). During the assembly process, approximately 50%–70% of circuit board defects occur in the SPP stage [1–3].
However, AOI systems can only detect defects such as air welding, solder bridges on PCBs, and missing electronic components. They are not effective in identifying defects arising from the SPP process, which often leads to high misjudgment rates and consequently reduces the overall quality of PCB manufacturing. Therefore, SPI plays a crucial role in the inspection process. Given that PCB clients generally demand extremely high-quality standards, manufacturers place great emphasis on screening out potentially faulty boards. In practice, they prefer to accept a higher false-alarm rate rather than allow any defective products to be delivered to their customers. As a result, PCB manufacturers continuously seek to improve the production yield.
Owing to the rapid development of IoT and broadband Internet technology, real-time big data generated by equipment in manufacturing systems can be effectively collected, stored, analyzed, and applied to realize the vision of smart manufacturing. Big data analysis is used to rapidly identify undetected problems in a manufacturing system. Such analysis also facilitates the early detection of these problems and provides manufacturing engineers with the essential decision-making information for problem-solving in manufacturing systems. This, in turn, enables continuous system optimization, improves product quality, and reduces production costs. So far, PCB manufacturers have installed IoT applications in manufacturing systems to collect big data from SPI machines. To accelerate data analysis, this research utilizes principal component analysis (PCA) to reduce the dimensionality of the SPI data. In addition, PCB manufacturers generally employ high-sensitivity settings for SPI machines to avoid delivering defective products to clients. However, this practice has greatly increased the number of products flagged as defective. Hence, professional quality inspectors on the production line must manually determine whether these products are truly defective. Manufacturing companies therefore have to allocate quality inspectors to repeatedly inspect these products, which increases labor costs and decreases production efficiency. In addition, during the manual inspection process, professional quality inspectors may also misjudge whether a product is defective. As a consequence, clients may still receive defective products. Because clients in the electronics industry impose stringent quality requirements on PCBs, manufacturers strive to minimize defect rates. Consequently, this research integrates data generation and dimensionality reduction techniques with a long short-term memory (LSTM) deep learning network to identify defective electronic parts.
Firstly, this study performs PCA to extract the feature variables of SPI data for reducing data dimensionality and removing less significant feature variables. In PCB assembly, defects arising from the solder paste deposition primarily encompass excessive or insufficient solder paste deposition, misalignment of solder paste, and inappropriate solder paste height. This study investigates the feature variables influencing these defects.
Moreover, defective samples are rare in the PCB industry, resulting in a highly imbalanced dataset. Therefore, this research employs the borderline synthetic minority over-sampling technique (SMOTE) to generate additional minority instances and thereby alleviate the class imbalance problem. Misjudgment by quality inspectors during the manual detection of defective products may result in incorrect data labels. When training a deep learning model, the quality of the training data is crucial because even a small number of mislabeled samples may significantly decrease prediction accuracy. Hence, the Edited Nearest Neighbor (ENN) method is used to remove samples suspected of being misjudged, thus preventing mislabeled samples from being used to train the deep learning model. Subsequently, the LSTM deep learning network is employed to construct a robust defect identification model for SPP. The proposed model can reduce the mental workload associated with repetitive manual reinspection and improve the manufacturing yield of PCBs.
2 Literature review
Regarding the key factors affecting SPP quality, Huang et al. [2] established the criteria for SPI. Their study indicated that increasing the printing pressure results in fewer solder paste deposits. In addition, the configuration of the soldering fixture determines the movement direction of the squeegee blade, which leads to considerable differences in the number of solder paste deposits. Rahman et al. [4] employed the Taguchi method to determine the optimal movement speed and pressure of the squeegee blade for different solder printers. Yu et al. [5] employed discrete-time signals and the discrete-state Markov chain to develop a performance-declining model of squeegee blades. The proposed model helps determine the optimal cleaning duration for the squeegee blade and identify a favorable balance between quality loss and idle time loss. To improve the manufacturing process of PCB assembly, Fung and Yung [6] integrated K-means clustering into the multi-response Taguchi method to screen for key features in the SMT manufacturing process and to optimize the process parameters, including squeegee blade pressure, speed, angle, and cleaning duration. Herchenbach et al. [7] developed an AI-based adaptive causal control system to identify optimal offline and online printing parameters. Overall, these studies focused on tuning SMT process parameters to reduce solder paste defects.
Paulo et al. [8] used the low-cost Raspberry Pi compute module and a camera to construct an optical SPI system to determine the appropriate amount of solder paste. However, the proposed system could not detect other types of solder paste defects. Zeng et al. [9] used curvatures and geometric parameters to analyze the three-dimensional shapes of soldering points. In their study, experts first defined the rules for detecting defects on the basis of their practical experience before detecting defects in the soldering points. Some scholars have used scientific data analysis to define rules for detecting SPP defects. Yuk et al. [10] employed speeded-up robust features to extract the fault patterns from 10 scratched circuit boards and computed the probability of the pattern being defective. Subsequently, they used the random forest method to distinguish qualified and defective PCBs. Most PCBs are qualified. However, the classification performance of the random forest method significantly decreased when it processed extremely imbalanced data. Therefore, PCB defect detection based on the random forest method cannot satisfy industrial requirements. In the AOI step, Dai et al. [11] employed a deep convolutional neural network to locate the soldering points for various configurations of PCBs. Through a clustering-based method, the system selected highly homogeneous clusters with a high level of confidence. An SVM classifier, trained on a limited set of labeled data, was employed to predict the classes of all unlabeled soldering points within these clusters. Next, the automatically labeled samples were incorporated into the manually annotated dataset to retrain the current SVM classifier. Owing to the inherent inaccuracy of the K-means algorithm, incorrectly labeled samples could be introduced into the training dataset, potentially degrading the SVM classifier’s performance.
Li et al. [12] applied single-shot multi-box detection to label the type and position of objects in images. They used the convolutional neural network (CNN) to analyze the types of holes on PCBs. However, this method cannot be used to detect PCB defects. Volkau et al. [13] employed the VGG16 architecture [14], which is based on transfer learning, to extract features from image samples of qualified PCBs. That study also adopted unsupervised representation learning to establish a one-class decision boundary. Samples of PCB images that exceeded the boundary were considered anomalies. The current model relies on a global threshold that is set manually to distinguish between normal and defective samples. The authors acknowledge that their approach lacks flexibility and an adaptive threshold.
Because the defect detection performance of AOI systems does not meet the industry standards, Mujeeb et al. [15] employed the deep autoencoder to develop a one-class feature learning method to enable the AOI system to correctly detect defect-free SPP images. To enable the AOI system to rapidly detect defect-free SPP images, the authors selected a golden sample and used data augmentation to generate training samples for the deep learning model. They then trained the model to correctly detect defect-free samples. However, because this method only uses one golden sample to generate the training samples, the trained neural network may not be able to correctly screen for other types of defect-free SPP samples. Sezer and Altan [16] combined a population-based optimization algorithm with a convolutional neural network to identify defects in SPP images. That algorithm was trained and tested on a relatively small dataset with balanced classes. Consequently, the dataset may not fully capture the diversity and complexity encountered in real production environments. Xia et al. [17] employed a region-based fully convolutional network and a focal loss function to detect minor defects on circuit boards. The focal loss function assigns more weight to hard samples to improve the detection of minor defects in circuit boards. They addressed the scarcity of minor defect samples by employing a data enhancement technique in which defective regions are randomly pasted onto different positions of the PCBs. However, such randomly generated samples are not guided by the decision boundary, which may weaken the model’s discriminative capability.
The above-mentioned methods are applicable in the AOI stage but cannot be used for early-stage defect detection in the SMT manufacturing process. Therefore, studies have also proposed defect detection methods for the SPP phase. SPI machines typically assume that the volumes of solder paste deposits follow the normal distribution and then compute their mean and variance. If the volume of a solder paste deposit differs considerably from the mean value, the SPI machine classifies that deposit as defective. However, this method has difficulty detecting PCB defects produced by SPP. Therefore, Yoo et al. [3] applied a convolutional recurrent reconstructive network (CRRN) to analyze the patterns of solder paste volumes under different temporal and spatial conditions. This method does not require the assumption of a normal distribution. The temporal condition depends on the cleaning cycle of the stencil and the movement direction of the squeegee blade. The spatial condition is determined by the positions of the soldering pads and the solder paste volumes. Most defects in SMT processes are caused by poor SPP quality. SPI machines mainly monitor the solder paste deposits. The threshold values of solder paste deposits are based on statistical analysis and expert experience. Consequently, SPI machines cannot effectively detect these defects.
Wei et al. [18] proposed the automatic threshold adjustment algorithm that employs density-based spatial clustering of applications with noise to identify arbitrarily shaped clusters on the basis of each feature of solder paste deposits, including the volume, area, and height. The largest cluster of each feature is identified and considered to represent defect-free SPP. On the basis of the three largest clusters formed in each feature, the upper and lower limits of the volume, area, and height of a solder paste deposit can be determined and used to initially screen out defective solder paste deposits. After the solder paste deposits are processed by the reflow oven, AOI is conducted, and experts manually determine whether they are defect-free and establish the labelled data. Subsequently, the SVM model is trained, and the upper and lower limits of the SPP features are updated. However, if two or more clusters identified by using the density-based spatial clustering of applications are misclassified as defect-free SPP, the upper and lower limits of the SPP features may be misjudged. Such misjudgment may lead to an increase in the number of defective products.
To increase AOI accuracy, Chang et al. [19] employed a Sigmoid function to establish a three-layer neural network. The feature variables of the input layer of the neural network comprised the volume, area, height, and offset of a solder paste deposit. The customized demand for PCBs has increased the variety of PCBs and decreased the production quantity of each type of PCB. Therefore, constructing an anomaly detection model of SPP for every PCB type is difficult. To overcome the insufficient number of training samples for the SPI data for customized PCBs, Zheng [20] employed the K-means clustering method to cluster SPP tasks with high similarity. Within each cluster, the same anomaly detection model can be used, and a small number of training samples are applied to adjust the model parameters to increase its accuracy. The feature variables of the model comprise the volume, area, and height of a solder paste deposit. That study compared the classification performance of isolation forest, XGBoost, random forest, and the neural network. Isolation forest demonstrated the highest F1 score. Isolation forest is an unsupervised algorithm used in global anomaly detection. If anomalies occur only in localized regions, isolation forest is less suitable for detection. This limitation is reflected in its accuracy for detecting solder paste defects. The detection accuracy of isolation forest was only 0.8, which cannot satisfy the quality requirements of the electronics industry. Hence, this research proposes a robust LSTM-based defect identification methodology to solve the above-mentioned issues.
Based on the literature review, few studies simultaneously consider both training efficiency and identification performance in solder paste defect detection models. To address the research gap, this study applies PCA to reduce data dimensionality, thereby shortening the training and validation time of artificial intelligence models. This facilitates the rapid deployment of intelligent solder paste defect detection systems on SMT production lines. Although several studies employ clustering-based approaches to reduce the time and labor required for data annotation, the class imbalance problem remains unresolved, which degrades the classification performance of machine learning models. In addition, noise samples located near the decision boundary may adversely affect the classification accuracy and robustness of AI model training. Therefore, this study adopts the ENN technique to remove noisy samples, which has not been considered in previous studies for SPI. Regarding classification models, most prior studies rely on decision trees, SVMs, and CNNs. To further enhance classification performance, this study integrates borderline SMOTE and ENN to propose the LSTM-based defect identification model. This model can effectively improve the detection of defective samples and prevent defective components from entering subsequent manufacturing processes.
3 Methodology
Figure 1 presents the framework of the robust LSTM-based defect identification methodology for imbalanced data of solder paste inspection. The research comprises three parts: (1) the key-feature extraction model for SPI data, (2) a borderline SMOTE and ENN based balanced data generation model, and (3) an LSTM-based identification model for SPP defects. The following subsections describe each part of the proposed methodology in detail. The first part uses PCA to develop the key-feature extraction model for SPI data to remove redundant information from the SPI dataset, accelerate model training, and enhance the performance of the LSTM deep learning network. Through this key-feature extraction model, a reduced-dimensional SPI dataset is obtained. In practice, the defect rate of SPP is very low, resulting in a severe shortage of defective samples. Such data imbalance substantially degrades the training performance of deep learning models and therefore needs to be addressed. To improve the binary classification ability of the deep learning model, this research adopts the borderline SMOTE to oversample instances near the decision boundary and generate informative minority-class samples. Moreover, the electronics industry demands extremely high yield rates for electronic components and does not allow defective products to be shipped. However, the classification results generated by the SPI machine may contain misjudgments. To alleviate this issue, this study removes potential majority-class outliers that may actually be defective but are labeled as non-defective. Therefore, Edited Nearest Neighbor (ENN) is used to identify suspicious instances near the decision boundary and eliminate them, thereby improving the ability to identify defective electronic parts. After obtaining the balanced and screened samples, the third part employs the lower-dimensional SPI dataset to create the LSTM-based identification model to identify SPP defects at early stages of assembly. The model prevents defects from entering subsequent manufacturing processes, which would lead to higher defect rates and increased PCB manufacturing costs.
Fig. 1
Framework of robust LSTM-based defect identification methodology for imbalanced data of solder paste inspection
In the first part, this study employs PCA to develop a key-feature extracting model for SPI data to reduce the dimensionality of the original dataset. For subsequent analysis, the feature values of SPI data are centered by subtracting their corresponding expected values, so that each feature has an expected value of zero. Next, the model projects the centered SPI data onto a set of orthogonal axes in the original feature space and evaluates the information content of each axis by estimating its variance. The number of axes is equal to the number of original dimensions in SPI data. The axis with the largest variance is called the first principal component, the axis with the second-largest variance is the second principal component, and so on. When all the principal components have been obtained, we can project SPI data down to a lower-dimensional hyperplane while retaining the overall data structure and variance pattern. Hence, this model can identify the best hyperplane to ensure that the projection preserves as much information as possible. Equations 1 and 2 are used to reduce the dimensionality of SPI data from a d-dimensional space to an r-dimensional subspace by determining the projection vectors associated with the largest eigenvalues of the covariance matrix.
The method of Lagrange multipliers is used to convert the constrained problem into a solvable form and determine the optimal solution, as shown in Eq. 3. After the first principal component is obtained, the process is repeated to obtain subsequent principal components. Because a principal component with a low explained variance carries little information, this study identifies and removes such redundant principal components.
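The procedure above can be sketched in a few lines of NumPy. In the usual textbook formulation, which Eqs. 1–3 are assumed to follow, each projection vector \(w\) maximizes \(w^{T}\Sigma w\) subject to \(w^{T}w = 1\), and the Lagrangian condition reduces to the eigenvalue problem \(\Sigma w = \lambda w\) for the covariance matrix \(\Sigma\). The function name `pca_reduce` and the random demonstration data are illustrative assumptions, not part of the study.

```python
import numpy as np

def pca_reduce(X, r):
    """Project X (n_samples x d) onto its r leading principal components."""
    # Center each feature so that its mean is zero.
    X_centered = X - X.mean(axis=0)
    # Sample covariance matrix of the centered features (d x d).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Share of the total variance carried by each principal component.
    explained_ratio = eigvals / eigvals.sum()
    # Keep the r directions with the largest eigenvalues.
    Z = X_centered @ eigvecs[:, :r]
    return Z, explained_ratio

# Hypothetical five-feature data, only to exercise the function.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])
Z_demo, ratio_demo = pca_reduce(X_demo, r=4)
```

The explained variance ratios returned here play the role of the "information content" of each axis: components whose ratio is close to zero are the candidates for removal.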
3.2 Borderline SMOTE and ENN based balanced SPI data generation model
To satisfy the demand for electronic products with high reliability among contemporary consumers, electronic component manufacturers must ensure that the components they provide to downstream system assembly plants are qualified. Therefore, the quality control of PCB assembly processes increasingly relies on accurate SPI to identify and remove defective electronic components from the production line. Samples located near the decision boundary are more difficult to classify, whereas samples far from the boundary contribute little to classifier learning [21]. Therefore, this study generates additional training data in the vicinity of the decision boundary. To fulfill industrial demand, this study applies the borderline SMOTE algorithm combined with ENN to develop a balanced SPI data generation model.
First, this paper uses borderline SMOTE to oversample defective samples near the decision boundary in order to improve the extremely imbalanced SPI data and to boost the identification rate of the minority class. To generate synthetic minority instances, borderline SMOTE creates new samples along the line segments between borderline minority samples and their selected nearest neighbors within the minority class. The procedure of borderline SMOTE is as follows [21]:
Step 1. For each minority-class sample \(i_k\), the m nearest neighbors are identified by calculating the distances between \(i_k\) and all samples in the training set. The number of majority-class samples among these m nearest neighbors is denoted by t.
Step 2. If \(t = m\), all m nearest neighbors belong to the majority class, and \(i_k\) is considered noise. If \(m/2 \le t < m\), the number of majority-class samples among the m nearest neighbors is at least as large as the number of minority-class samples. Such minority samples are easily misclassified and are therefore assigned to the danger set. If \(0 \le t < m/2\), \(i_k\) is easily classified and is assigned to the safe set.
Step 3. The samples in the danger set are treated as borderline data of the minority class. For each sample in the danger set, the k nearest neighbors belonging to the minority class are identified.
Step 4. For each sample in the danger set, p synthetic samples are generated. Let \(d_i\) denote the ith dangerous sample. For each \(d_i\), p neighbors are randomly selected from its k nearest minority-class neighbors. Next, the differences between \(d_i\) and its p selected neighbors are computed. Equation 4 is used to generate the p synthetic samples.
\(gen_{i,\;p}\) : the p-th synthetic minority sample generated for the ith dangerous sample.
\(random_{i,\;p}\) : a random number between 0 and 1 corresponding to the pth synthetic minority sample for the ith dangerous sample.
\(dif_{i,\;p}\) : the difference between \(d_i\) and its pth selected neighbor.
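A minimal sketch of Steps 1–4 is given below. It assumes the standard borderline SMOTE interpolation rule, \(gen_{i,\;p} = d_i + random_{i,\;p} \times dif_{i,\;p}\), with \(dif_{i,\;p}\) taken as the selected neighbor minus \(d_i\); this is the form in [21] and is assumed, not confirmed, to match Eq. 4. The function name, the label convention (1 for the minority class, 0 for the majority class), and the toy data are illustrative assumptions.

```python
import numpy as np

def borderline_smote(X, y, m=5, k=5, p=2, seed=0):
    """Generate synthetic minority samples (label 1) near the class boundary."""
    rng = np.random.default_rng(seed)
    X_min = X[y == 1]
    synthetic = []
    for x in X_min:
        # Step 1: m nearest neighbors of x in the whole training set
        # (position 0 after sorting is x itself, at distance zero).
        dist_all = np.linalg.norm(X - x, axis=1)
        nn_all = np.argsort(dist_all)[1:m + 1]
        t = int(np.sum(y[nn_all] == 0))  # number of majority-class neighbors
        # Step 2: keep only "danger" samples, i.e. m/2 <= t < m.
        if not (m / 2 <= t < m):
            continue
        # Step 3: k nearest minority-class neighbors of the danger sample.
        dist_min = np.linalg.norm(X_min - x, axis=1)
        nn_min = np.argsort(dist_min)[1:k + 1]
        if len(nn_min) == 0:
            continue
        # Step 4: interpolate toward p randomly selected minority neighbors.
        chosen = rng.choice(nn_min, size=min(p, len(nn_min)), replace=False)
        for j in chosen:
            dif = X_min[j] - x
            synthetic.append(x + rng.random() * dif)
    return np.array(synthetic).reshape(-1, X.shape[1])

# Toy data: four majority points followed by three minority points on a line.
X_toy = np.array([[4.0, 0.0], [4.5, 0.0], [3.0, 0.0], [0.0, 0.0],
                  [5.0, 0.0], [6.0, 0.0], [10.0, 0.0]])
y_toy = np.array([0, 0, 0, 0, 1, 1, 1])
syn = borderline_smote(X_toy, y_toy, m=3, k=2, p=1)
```

In the toy data, the minority points at 5 and 6 each have two majority samples among their three nearest neighbors and fall into the danger set, whereas the point at 10 is safe, so exactly two synthetic samples are produced, both lying between existing minority points.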
Because the misclassification of truly defective electronic parts is extremely costly, this research removes noisy samples from the majority class. To this end, the ENN algorithm is used to eliminate potentially mislabeled samples from the majority class. The ENN procedure is summarized as follows [22]:
Step 1. For each \(a_i\) in the majority class, identify its K-nearest neighbors, where K is usually set to 3.
Step 2. Determine the majority class of the K-nearest neighbors.
Step 3. Compare this class label with that of \(a_i\).
Step 4. If they are different, remove \(a_i\) and its K-nearest neighbors from the dataset.
Step 5. Repeat Steps 1–4 until all majority-class samples have been examined.
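The ENN steps can be sketched as follows. By default the sketch drops only the flagged sample \(a_i\), which is the classic editing rule; the optional `drop_neighbors` flag additionally removes its K nearest neighbors, as Step 4 describes. The function name, the flag, the label convention (0 for the majority class), and the toy data are illustrative assumptions.

```python
import numpy as np

def enn_majority(X, y, K=3, drop_neighbors=False):
    """Edited nearest neighbor cleaning applied to the majority class (label 0)."""
    remove = np.zeros(len(X), dtype=bool)
    for i in np.flatnonzero(y == 0):
        # Step 1: K nearest neighbors of a_i, excluding a_i itself.
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf
        nn = np.argsort(dist)[:K]
        # Step 2: majority vote over the neighbors' labels.
        vote = int(np.sum(y[nn] == 1) > K / 2)
        # Steps 3-4: flag a_i when the vote disagrees with its own label.
        if vote != y[i]:
            remove[i] = True
            if drop_neighbors:
                remove[nn] = True
    # Step 5 is the loop itself: every majority-class sample is examined once.
    keep = ~remove
    return X[keep], y[keep]

# Toy data: the last sample is labeled 0 but sits inside the class-1 cluster.
X_toy = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0], [11.5]])
y_toy = np.array([0, 0, 0, 1, 1, 1, 0])
X_clean, y_clean = enn_majority(X_toy, y_toy, K=3)
```

All neighbor votes are computed on the original dataset, so the outcome does not depend on the order in which the majority-class samples are visited.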
3.3 LSTM based identification model for SPP defects
In the third part of this study, an LSTM deep learning network is employed to develop the identification model for SPP defects. Figure 2 illustrates the architecture of the proposed LSTM-based identification model for SPP defects. This model adopts a many-to-one framework. The feature variables used for SPP defect identification serve as the input vector, whereas the output is a single binary variable, denoted by \(y_t\). When the output is 0, the model classifies the solder paste deposit as defect-free in the SMT manufacturing process. Conversely, when the output is 1, the model classifies the solder paste deposit as defective. The LSTM-based identification model for SPP defects comprises three gates: the input gate, the forget gate, and the output gate. The input gate specifies how much new information is written to the memory cell. The forget gate determines how much historical information is retained. The output gate controls which information is passed from the LSTM deep learning model to the next layer. All gates use the sigmoid function, which produces values between 0 and 1. When the output of a sigmoid function is close to 1, the corresponding gate is effectively opened. In contrast, when it is close to 0, the gate is effectively closed. The input vector of the identification model, together with the output from the previous time step, is fed to the tanh activation function to produce the aggregated vector \(g_t\). Equations 5–10 summarize the formulation of the LSTM-based identification model for SPP defects [23].
Fig. 2
Architecture of the proposed LSTM-based identification model for SPP defects
\(x_t\) : the input vector to the identification model at time t;
\(y_t,\;h_t\) : the output state of the identification model at time t;
\(g_t\) : the aggregated vector constructed from the input vector at time t and the output state at time \(t-1\) of the identification model.
\(i_t\) : the degree to which the input gate is opened, determined by the input vector at time t and the output state at time \(t-1\) of the identification model;
\(f_t\) : the degree to which the forget gate is opened, determined by the input vector at time t and the output state at time \(t-1\) of the identification model;
\(c_t\) : the information stored in the memory cell at time t;
\(o_t\) : the degree to which the output gate is opened, determined by the input vector to the identification model at time t and the output state at time \(t-1\);
\(b_i,\;b_f,\;b_o,\;b_g\) : bias vectors for the input gate, the forget gate, the output gate, and the input layer, respectively;
\(w_{xi}^T,\;w_{xf}^T,\;w_{xo}^T,\;w_{xg}^T\) : weight matrices between the input gate and input vector, between the forget gate and input vector, between the output gate and input vector, and between the input layer and input vector, respectively;
\(w_{hi}^T,\;w_{hf}^T,\;w_{ho}^T,\;w_{hg}^T\) : weight matrices between the input gate and output at time t − 1, between the forget gate and output at time t − 1, between the output gate and output at time t − 1, and between the input layer and output at time t − 1, respectively.
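For reference, the standard LSTM cell can be written in the notation defined above as follows. This is the common textbook form and is assumed, not confirmed, to correspond to Eqs. 5–10. Here \(\sigma\) denotes the sigmoid function, \(\odot\) denotes element-wise multiplication, and \(w_{ho}^T\) is the weight matrix between the output gate and the output at time \(t-1\).

```latex
\begin{aligned}
i_t &= \sigma\left(w_{xi}^{T} x_t + w_{hi}^{T} h_{t-1} + b_i\right) \\
f_t &= \sigma\left(w_{xf}^{T} x_t + w_{hf}^{T} h_{t-1} + b_f\right) \\
o_t &= \sigma\left(w_{xo}^{T} x_t + w_{ho}^{T} h_{t-1} + b_o\right) \\
g_t &= \tanh\left(w_{xg}^{T} x_t + w_{hg}^{T} h_{t-1} + b_g\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
y_t &= h_t = o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The fifth line shows the roles of the gates described earlier: \(f_t\) scales how much of the previous memory \(c_{t-1}\) is retained, and \(i_t\) scales how much of the new vector \(g_t\) is written to the memory cell.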
4 Case study
This research collected 401,393 SPI samples from an electronics company in Taiwan, of which only 1,393 were defective. The SPI machine automatically inspects each sample and assigns it a label of either non-defective or defective. To prevent defective samples from entering subsequent manufacturing processes, the SPI system is configured with a high sensitivity level. However, this high sensitivity inevitably generates pseudo-defective labels. Therefore, inspection engineers manually re-evaluate these samples and assign the final verified labels.
The SPI dataset contains five feature variables. The first three feature variables are the volume, area, and thickness of a solder paste deposit. The fourth and fifth variables are the x-axis offset and the y-axis offset for the location of the solder paste deposit. Owing to the different units and scales of these feature variables, we first used the min-max normalization technique to scale all feature values to the range of 0 to 1. To decrease the computational time during model training and testing while retaining as much important information as possible, the key-feature extracting model is applied to analyze the principal components and reduce the dimensionality of SPI data. The explained variance ratios of the five principal components are 0.9390, 0.0319, 0.0197, 0.0075, and 0.0028, respectively. Because the fifth principal component has the lowest explained variance ratio, i.e., 0.0028, it contributes minimal information. Hence, this study removes it to obtain a four-dimensional SPI dataset. To assess the effect of principal component analysis on computational efficiency, all experiments were conducted on a laptop computer with an Intel Core i7-8700 CPU, 8 GB of RAM, and an NVIDIA RTX 4070 GPU with 16 GB of Video Random Access Memory (VRAM). The system operated on Windows 11 (64-bit), and all deep learning models were implemented using Python 3.9.18 and TensorFlow 2.1. We compared the computation times of the original and reduced-dimensional data; the dimensionality reduction shortened the computation time by 11.08%.
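The two preprocessing calculations in this paragraph can be illustrated briefly. The sketch below shows column-wise min-max normalization on hypothetical feature values and then sums the reported explained variance ratios to see how much variance the first four principal components retain. The function name `min_max_scale` and the demonstration matrix are assumptions made for illustration; only the five ratios come from the text.

```python
import numpy as np

def min_max_scale(X):
    """Scale each column of X linearly to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # Guard against constant columns to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

# Hypothetical values for two features with very different scales.
X_demo = np.array([[10.0, 0.2], [20.0, 0.4], [30.0, 0.8]])
X_scaled = min_max_scale(X_demo)

# Explained variance ratios reported for the five principal components.
ratios = np.array([0.9390, 0.0319, 0.0197, 0.0075, 0.0028])
cumulative = np.cumsum(ratios)
retained_four = cumulative[3]  # variance share kept by the first four components
```

With the reported ratios, the first four components together account for about 0.9981 of the total variance, which is why dropping the fifth component loses very little information.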
In this case, the defective solder paste deposits belong to the minority class. Because severe data imbalance degrades the training performance of the identification model for SPP defects, this research employed borderline SMOTE to synthesize additional defective solder paste deposits and balance the data. Next, the ENN algorithm was used to remove potentially mislabeled samples, thereby decreasing the misclassification of solder paste deposits. After resampling, the dataset contains 399,104 defective samples and 399,909 non-defective samples. The lower-dimensional balanced SPI dataset was then split into 80% training samples and 20% testing samples. The LSTM network was utilized to create the identification model for SPP defects. Table 1 summarizes the key parameter settings used for training the LSTM-based identification model for SPP defects. The model was trained for 20 epochs using the Adam optimizer. The binary cross-entropy loss function was employed to measure the discrepancy between predicted and true labels, and a batch size of 32 was applied during the training stage.
Table 1
Key parameter settings of the identification model for SPP defects
| Parameter | Value |
| --- | --- |
| Epochs | 20 |
| Optimizer | Adam |
| Loss function | Binary cross-entropy loss |
| Batch size | 32 |
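The resampling steps described above can be sketched in plain Python on toy two-dimensional data. This is a simplified illustration of the borderline SMOTE idea (oversample only minority samples whose neighbourhood is dominated, but not fully occupied, by the majority class) and of the ENN editing rule (drop a sample whose label disagrees with the majority vote of its nearest neighbours). It is not the authors' implementation; the helper names, the neighbourhood size `k=3`, and all data points are assumptions made for the example.

```python
import math
import random
from collections import Counter


def k_nearest(idx, points, k, candidates=None):
    """Indices of the k candidates closest to points[idx] (Euclidean), excluding idx."""
    pool = range(len(points)) if candidates is None else candidates
    others = [j for j in pool if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def borderline_smote(points, labels, minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples near the class boundary."""
    rng = random.Random(seed)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    # "Danger" samples: at least half, but not all, of the k neighbours are majority.
    danger = []
    for i in minority_idx:
        n_major = sum(labels[j] != minority for j in k_nearest(i, points, k))
        if k / 2 <= n_major < k:
            danger.append(i)
    if not danger or len(minority_idx) < 2:
        return []
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(danger)
        # Interpolate toward a randomly chosen minority-class neighbour.
        j = rng.choice(k_nearest(i, points, k, minority_idx))
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(points[i], points[j])))
    return synthetic


def enn_filter(points, labels, k=3):
    """Keep only samples whose label matches the majority vote of their k neighbours."""
    kept_points, kept_labels = [], []
    for i in range(len(points)):
        votes = Counter(labels[j] for j in k_nearest(i, points, k))
        if votes.most_common(1)[0][0] == labels[i]:
            kept_points.append(points[i])
            kept_labels.append(labels[i])
    return kept_points, kept_labels


# Toy set 1 (hypothetical): six majority points (label 0), three minority points (label 1).
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1), (5, 0.5)]
lbl = [0, 0, 0, 0, 0, 0, 1, 1, 1]
synthetic = borderline_smote(pts, lbl, minority=1, n_new=4)

# Toy set 2 (hypothetical): two clean clusters plus one mislabeled point at (0.5, 0.5).
pts2 = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
lbl2 = [0, 0, 0, 0, 1, 1, 1, 1, 1]
kept_points, kept_labels = enn_filter(pts2, lbl2)

print(len(synthetic), len(kept_points))
```

In the first toy set, only the two minority points adjacent to the majority cluster qualify as borderline, so every synthetic point lies on a segment starting from one of them. In the second, the ENN rule removes the single point whose neighbours all carry the other label. For real data, a library such as imbalanced-learn offers `BorderlineSMOTE` and `EditedNearestNeighbours`; the text does not state which implementation the study used.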
Figure 3 illustrates the training and validation loss trajectories across 20 epochs. The training loss decreases sharply during the initial epochs and subsequently converges smoothly toward a low and stable value, indicating effective optimization. The validation loss exhibits a comparable downward trend with only minor fluctuations. In parallel, both training and validation accuracies exceed 0.99, indicating good generalization capability, as shown in Fig. 4. Figure 5 presents the confusion matrix of the identification model for SPP defects. The model correctly classifies the vast majority of both non-defective (label 0) and defective (label 1) samples, achieving true negative and true positive rates of 0.9967 and 0.9969, respectively. Only a small proportion of samples are misclassified, with false positive and false negative rates of 0.0033 and 0.0031, respectively. By comparison, without the borderline SMOTE and ENN techniques, the LSTM model attains a true positive rate of only 0.7735 on defective samples, as shown in Table 2. These results demonstrate that the proposed model exhibits highly reliable discrimination capability across both classes.
Fig. 3
Training and validation loss curves of the identification model for SPP defects
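As a brief illustration of how the four rates of a binary confusion matrix relate to one another, the sketch below computes them from raw counts, treating defective (label 1) as the positive class. The counts are hypothetical, chosen per 10,000 samples of each class so that they reproduce the reported true negative and true positive rates; they are not the counts from Fig. 5.

```python
def confusion_rates(tn, fp, fn, tp):
    """Return TNR, FPR, TPR and FNR from binary confusion-matrix counts."""
    negatives = tn + fp  # actual non-defective samples (label 0)
    positives = tp + fn  # actual defective samples (label 1)
    return {
        "TNR": tn / negatives,  # non-defective correctly passed
        "FPR": fp / negatives,  # non-defective flagged as defective
        "TPR": tp / positives,  # defective correctly flagged
        "FNR": fn / positives,  # defective missed
    }


# Hypothetical counts, for illustration only.
rates = confusion_rates(tn=9967, fp=33, fn=31, tp=9969)
print(rates)
```

Because each pair of rates shares a denominator, the false positive rate is the complement of the true negative rate, and the false negative rate is the complement of the true positive rate.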
Table 2
Performance comparison of the proposed method with existing classification models
| Model | Class | Precision (%) | Recall (%) | F1-score (%) | AUC (%) |
| --- | --- | --- | --- | --- | --- |
| Random Forest | non-defective | 99.93 | 99.99 | 99.96 | 89.72 |
| | defective | 97.02 | 79.44 | 87.36 | |
| CNN | non-defective | 99.95 | 99.97 | 99.96 | 98.43 |
| | defective | 85.90 | 78.82 | 82.21 | |
| GRU | non-defective | 99.93 | 99.96 | 99.95 | 97.24 |
| | defective | 88.12 | 83.94 | 83.94 | |
| LSTM | non-defective | 99.92 | 99.98 | 99.95 | 96.49 |
| | defective | 94.87 | 77.35 | 85.22 | |
| The proposed method | non-defective | 99.68 | 99.67 | 99.67 | 99.99 |
| | defective | 99.67 | 99.69 | 99.68 | |
Table 2 summarizes the performance of the proposed identification model in comparison with four widely used classifiers: Random Forest, Convolutional Neural Network (CNN), Gated Recurrent Unit (GRU), and LSTM. The proposed method outperforms the benchmark models in identifying defective samples, which represent the minority and more challenging class. For the non-defective class, all models exhibit precision, recall, and F1-scores above 99.6%, indicating that distinguishing non-defective samples is relatively straightforward for both traditional machine learning and deep learning approaches. However, substantial performance differences emerge in the defective class. Random Forest achieves an F1-score of 0.8736, while CNN, GRU, and LSTM score slightly lower, with F1-scores of 0.8221, 0.8394, and 0.8522, respectively. In contrast, the proposed method attains an F1-score of 0.9968 for defective samples, representing a significant enhancement in minority-class recognition. A similar trend is observed in terms of the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC). The proposed model achieves an AUC of 0.9999, surpassing CNN (0.9843), GRU (0.9724), LSTM (0.9649), and Random Forest (0.8972). With regard to precision and recall for the defective class, the proposed method also outperforms the other models. This substantial improvement highlights the superior discrimination capability of the proposed approach under highly imbalanced conditions.
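The F1-scores discussed above follow from precision and recall as their harmonic mean, F1 = 2PR / (P + R). The short sketch below recomputes the defective-class F1-score for three of the rows in Table 2 from the tabulated precision and recall; the function name `f1_score` is chosen for this example, and small deviations from the table can arise because the tabulated inputs are already rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Defective-class (precision %, recall %) pairs taken from Table 2.
defective_class = {
    "CNN": (85.90, 78.82),
    "LSTM": (94.87, 77.35),
    "The proposed method": (99.67, 99.69),
}

f1 = {name: round(f1_score(p, r), 2) for name, (p, r) in defective_class.items()}
print(f1)
```

The harmonic mean is pulled toward the lower of its two inputs, which is why the plain LSTM, despite a precision of 94.87%, ends up with an F1-score close to 85% once its recall of 77.35% is taken into account.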
Table 3 compares the classification performance of the proposed method with representative studies in the literature. As shown in the table, the proposed method achieves a precision of 99.67% and an accuracy of 99.68%, outperforming all selected benchmark approaches. Dai et al. [11] and Zeng et al. [9] reported precision values of 95.16% and 98.3%, respectively, while Kong et al. [24] reported an accuracy of 90.33%. Overall, these results highlight the effectiveness of the proposed method in classification tasks compared with existing studies.
Table 3
Classification accuracy of the proposed method in comparison with other papers
Owing to the rapid development of AI algorithms and the significant improvement in computing power, implementing smart manufacturing to boost competitiveness has become a critical issue. Through Industrial Internet of Things (IIoT) devices, PCB manufacturers can collect large volumes of manufacturing data at low cost. For consumers, the high reliability of electronic products is a key factor influencing purchasing decisions. Therefore, system integrators impose stringent quality requirements on electronic component suppliers. To meet this challenge, companies need to strengthen the defect-identification capability of the SMT process. Hence, this research proposes a robust identification methodology for SPP defects. First, PCA is used to reduce the dimensionality of the SPI data by removing the low-information component, thereby speeding up the analysis. To address the common problem of data imbalance in the electronics industry, this study employs the borderline SMOTE technique to generate defective samples, preventing the deep learning model from being biased toward non-defective samples when building a binary classification model. In addition, the ENN method is used to eliminate potentially mislabeled instances, yielding a more balanced and reliable training and testing dataset. Finally, a robust LSTM-based identification model for SPP defects is created to enable PCB manufacturers to deliver qualified components to system integrators.
Declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1. Biemans H (2011) 5D solder paste inspection merits beyond 3D technology. Global SMT Packaging 11:8–13
2. Huang CY, Lin YH, Ying KC, Ku CL (2011) The solder paste printing process: critical parameters, defect scenarios, specifications, and cost reduction. Solder Surf Mt Technol 23:211–223
3. Yoo YH, Kim UH, Kim JH (2020) Convolutional recurrent reconstructive network for spatiotemporal anomaly detection in solder paste inspection. IEEE Trans Cybern 52:4688–4700
4. Rahman MNA, Zubir NSM, Leuveano RAC, Ghani JA, Mahmood WMFW (2014) Reliability study of solder paste alloy for the improvement of solder joint at surface mount fine-pitch components. Materials 7:7706–7721
5. Yu J, Cao L, Fu H, Guo J (2019) A method for optimizing stencil cleaning time in solder paste printing process. Solder Surf Mt Technol 31:233–239
6. Fung VWC, Yung KC (2020) An intelligent approach for improving printed circuit board assembly process performance in smart manufacturing. Int J Eng Bus Manag 12:1–12
7. Herchenbach M, Weinzierl S, Zilker S, Schwulera E, Matzner M (2025) A methodology for adaptive AI-based causal control: toward an autonomous factory in solder paste printing. Comput Ind 167:104256
8. Paulo GBF, Perdigoto LMR, Faria SMM (2021) Automatic visual inspection system for solder paste deposition in PCBs. Telecoms Conference (ConfTELE), Leiria, Portugal, pp 1–6
9. Zeng Y, Hu Y, Zhang X, Luo Z, Wei X (2021) A novel solder joints inspection method using curvature and geometry features in high-density flexible IC substrates surface mount technology. Phys Scr 96:125528
10. Yuk EH, Park SH, Park CS, Baek JG (2018) Feature-learning-based printed circuit board inspection via speeded-up robust features and random forest. Appl Sci 8:932
11. Dai W, Mujeeb A, Erdt M, Sourin A (2020) Soldering defect detection in automatic optical inspection. Adv Eng Inform 43:1–7
12. Li D, Xu L, Ran G, Guo Z (2021) Computer vision based research on PCB recognition using SSD neural network. J Phys Conf Ser 1815:012005
13. Volkau I, Mujeeb A, Wenting D, Marius E, Alexei S (2019) Detection defect in printed circuit boards using unsupervised feature extraction upon transfer learning. International Conference on Cyberworlds, Kyoto, Japan, pp 101–108
14. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. International Conference on Learning Representations (ICLR 2015), San Diego, USA, pp 1–14
15. Mujeeb A, Dai W, Erdt M, Sourin A (2019) One class based feature learning approach for defect detection using deep autoencoders. Adv Eng Inform 42:100933
16. Sezer A, Altan A (2021) Optimization of deep learning model parameters in classification of solder paste defects. International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Ankara, Turkey, pp 1–6
17. Xia S, Wang F, Xie F, Huang L, Wang Q, Ling X (2021) An efficient and robust target detection algorithm for identifying minor defects of printed circuit board based on PHFE and FL-RFCN. Front Phys 9:661091
18. Wei CC, Hsieh P, Chen J (2019) Automatic adjustment of thresholds via closed-loop feedback mechanism for solder paste inspection. World Acad Sci Eng Technol 13:726–730
19. Chang YM, Wei CC, Chen J, Hsieh P (2018) Classification of solder joints via automatic mistake reduction system for improvement of AOI inspection. International Microsystems, Packaging, Assembly and Circuits Technology Conference (IMPACT), Taipei, Taiwan, pp 24–26
20. Zheng Z, Pu J, Liu L, Wang D, Mei X, Zhang S, Dai Q (2020) Contextual anomaly detection in solder paste inspection with multi-task learning. ACM Trans Intell Syst Technol 11:1–17
21. Han H, Wang W, Mao B (2005) Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. Proceedings of the 2005 International Conference on Advances in Intelligent Computing, Hefei, China, pp 878–887
22. Wilson DL (1972) Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans Syst Man Cybern Syst SMC-2(3):408–421
24. Kong D, Hu X, Zhang J, Liu X, Zhang D (2024) Design of intelligent inspection system for solder paste printing defects based on improved YOLOX. iScience 27:109147