Published in: Integrating Materials and Manufacturing Innovation 2/2022

14-03-2022 | Technical Article

Self-Supervised Deep Hadamard Autoencoders for Treating Missing Data: A Case Study in Manufacturing

Authors: Rasika Karkare, Randy Paffenroth, Diran Apelian

Abstract

Data collected from sensors are pivotal to Industrial Internet of Things (IIoT) applications, which become unusable when data quality is poor. Missing values commonly appear in industrial datasets for several reasons: unexpected changes in the process, failure of the sensors recording the measurements due to network connectivity or hardware issues, censored or anonymized features, and human error. In the manufacturing context, collecting all the different types of data related to a part is expensive and time consuming, which leads to siloed and incomplete datasets. Given characteristics such as high missingness ratios, varying patterns of missingness, class imbalance, and small dataset sizes, approaches such as single imputation, or dropping the rows with missing values as in complete case analysis, result in a loss of information and can also bias the analysis, depending on the underlying mechanism of missingness. The method for treating missing values must therefore be chosen carefully based on the characteristics of the data at hand, since the downstream generalization performance of algorithms trained on the imputed data depends directly on the quality of the imputation. In this work, we present a novel approach to treating missing values in a real-world dataset. Our approach combines deep Hadamard autoencoders (HAE) with the self-supervised learning paradigm and achieves better imputation performance on the unobserved entries of the test set than single imputation using the mean or standard autoencoders (SAE). We demonstrate the effectiveness of the proposed method with a case study on a high-pressure die casting (HPDC) dataset.
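To illustrate the core mechanism described in the abstract, the sketch below (ours, not the authors' implementation; all variable names are illustrative) shows how a Hadamard (elementwise) product with the observation mask restricts a reconstruction loss to observed entries only, alongside the column-mean single-imputation baseline the paper compares against:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data matrix with missing entries (NaN), standing in for an industrial dataset.
X_true = rng.normal(size=(6, 4))
mask = rng.random(X_true.shape) > 0.3   # True where observed, False where missing
mask[0, :] = True                       # ensure every column has observed entries
mask[1, 0] = False                      # ensure at least one entry is missing
X_obs = np.where(mask, X_true, np.nan)

# Single-imputation baseline: replace each missing entry with its column mean.
col_mean = np.nanmean(X_obs, axis=0)
X_mean = np.where(mask, X_obs, col_mean)

# Hadamard-masked reconstruction loss: multiplying the squared error elementwise
# by the observation mask means the model is never penalized for its guesses at
# unobserved positions -- the idea behind the Hadamard autoencoder loss.
def masked_mse(X_hat, X, mask):
    m = mask.astype(float)
    return float(np.sum(m * (X_hat - X) ** 2) / np.sum(m))

# Sanity check: mean imputation reproduces observed entries exactly, so the
# masked loss over observed positions is zero; what matters for evaluating
# imputation quality is the error on the *unobserved* entries.
obs_err = masked_mse(X_mean, X_true, mask)
unobs_err = masked_mse(X_mean, X_true, ~mask)
```

In the paper's setting, the same masked loss would drive gradient updates of an autoencoder, so that training fits only the observed entries while the network's outputs at masked positions serve as the imputations.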
Footnotes
1. GitHub: https://github.com/rskarkare/Self-supervised-Hadamard-Autoencoders
Metadata
Title: Self-Supervised Deep Hadamard Autoencoders for Treating Missing Data: A Case Study in Manufacturing
Authors: Rasika Karkare, Randy Paffenroth, Diran Apelian
Publication date: 14-03-2022
Publisher: Springer International Publishing
Published in: Integrating Materials and Manufacturing Innovation, Issue 2/2022
Print ISSN: 2193-9764
Electronic ISSN: 2193-9772
DOI: https://doi.org/10.1007/s40192-022-00254-7
