Published in: Integrating Materials and Manufacturing Innovation 2/2022

14.03.2022 | Technical Article

Self-Supervised Deep Hadamard Autoencoders for Treating Missing Data: A Case Study in Manufacturing

Authors: Rasika Karkare, Randy Paffenroth, Diran Apelian


Abstract

Data collected from sensors are pivotal to Industrial Internet of Things (IIoT) applications, which become unusable when data quality is poor. Missing values commonly appear in industrial datasets for several reasons, including unexpected changes in the process, failures of the sensors recording the measurements due to network connectivity or hardware issues, censored or anonymized features, and human error. In the manufacturing context, collecting all the different types of data related to a part is expensive and time consuming, leading to siloed and incomplete datasets. Because these datasets exhibit high missingness ratios, varying patterns of missingness, class imbalance, and small sample sizes, approaches such as single imputation or dropping rows with missing values, as is done in complete-case analysis, result in loss of information and can also bias the analysis, depending on the underlying mechanism of missingness. The treatment of missing values must therefore be chosen carefully based on the characteristics of the data at hand, since the downstream generalization performance of algorithms trained on the imputed data depends directly on the quality of the imputation. In this work, we present a novel approach to treating missing values in a real-world dataset. Our approach combines deep Hadamard autoencoders (HAE) with the self-supervised learning paradigm and achieves better imputation performance on the unobserved entries of the test set than single imputation using the mean or methods such as standard autoencoders (SAE). We demonstrate the effectiveness of the proposed method through a case study on a high-pressure die casting (HPDC) dataset.
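To make the mechanism concrete, the sketch below shows one plausible reading of a Hadamard autoencoder training step in PyTorch: the observation mask enters the reconstruction loss through an elementwise (Hadamard) product, so unobserved entries contribute no error, and self-supervision is obtained by hiding a random subset of observed entries at the input and reconstructing them. This is a minimal sketch; the layer sizes, the masking rate p_hide, and all function names are illustrative assumptions rather than the authors' implementation (see the GitHub repository in the footnotes for that).

```python
import torch
import torch.nn as nn

class HadamardAutoencoder(nn.Module):
    """Minimal dense autoencoder; layer sizes are illustrative, not the paper's."""
    def __init__(self, n_features, hidden=32, latent=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, latent), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent, hidden), nn.ReLU(),
            nn.Linear(hidden, n_features),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def masked_mse(x_hat, x, mask):
    # Hadamard (elementwise) product with the observation mask restricts the
    # reconstruction error to entries that are actually observed.
    diff = (x_hat - x) * mask
    return diff.pow(2).sum() / mask.sum().clamp(min=1.0)

def train_step(model, optimizer, x, observed_mask, p_hide=0.2):
    # x: data matrix with missing entries already filled with 0.
    # observed_mask: float tensor of 0/1, same shape as x (1 = observed).
    # Self-supervised step: hide a random subset of *observed* entries so the
    # network must reconstruct them from the remaining values.
    hide = (torch.rand_like(observed_mask) < p_hide) & observed_mask.bool()
    input_mask = observed_mask * (~hide).float()
    x_in = x * input_mask                       # hidden/missing entries zeroed out
    x_hat = model(x_in)
    loss = masked_mse(x_hat, x, observed_mask)  # supervise on observed entries only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

A full training loop would iterate this step over mini-batches; at test time, the network's outputs at the unobserved positions serve as the imputed values.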
Footnotes
1. GitHub: https://github.com/rskarkare/Self-supervised-Hadamard-Autoencoders
 
Metadata
Title
Self-Supervised Deep Hadamard Autoencoders for Treating Missing Data: A Case Study in Manufacturing
Authors
Rasika Karkare
Randy Paffenroth
Diran Apelian
Publication date
14.03.2022
Publisher
Springer International Publishing
Published in
Integrating Materials and Manufacturing Innovation / Issue 2/2022
Print ISSN: 2193-9764
Electronic ISSN: 2193-9772
DOI
https://doi.org/10.1007/s40192-022-00254-7
