
12.05.2022 | Regular Paper

Overfitting measurement of convolutional neural networks using trained network weights

Authors: Satoru Watanabe, Hayato Yamana

Published in: International Journal of Data Science and Analytics | Issue 3/2022

Abstract

Overfitting reduces the generalizability of convolutional neural networks (CNNs). It is typically detected by comparing the accuracies and losses on the training and validation data, where the validation data are held out from the training data; however, such detection methods are inapplicable to pretrained networks distributed without their training data. In this paper, we therefore propose a method, inspired by the dropout technique, that detects overfitting of CNNs from the trained network weights alone. Dropout prevents CNNs from overfitting by randomly invalidating neurons during training. This technique is hypothesized to work by restraining co-adaptations among neurons, which implies that overfitting results from such co-adaptations and can therefore be detected by investigating the inner representation of a CNN. The proposed persistent-homology-based overfitting measure (PHOM) constructs clique complexes over a CNN using the trained network weights and applies one-dimensional persistent homology to investigate co-adaptations among neurons. In addition, we extend PHOM to the normalized PHOM (NPHOM), which mitigates the fluctuation in PHOM caused by differences in network structure. We applied both methods to CNNs trained for classification on the CIFAR-10, Street View House Numbers (SVHN), Tiny ImageNet, and CIFAR-100 datasets. The experimental results demonstrate that PHOM and NPHOM indicate the degree of overfitting of a CNN, suggesting that these methods enable us to filter out overfitted CNNs without requiring the training data.
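For intuition, the following is a minimal sketch of this construction, not the authors' implementation (their code is linked in footnote 2). It treats a single dense layer's weight matrix as a weighted bipartite graph, converts absolute weights into dissimilarities so that strong connections enter the filtration early, and computes one-dimensional persistent homology with the `ripser` package; a Vietoris-Rips filtration over a graph's distance matrix is exactly a clique-complex (flag) filtration, which is what makes this a rough stand-in. The function name `phom_like_score`, the dissimilarity transform, and the summary statistic are our own illustrative choices.

```python
# Minimal sketch (not the paper's PHOM implementation): one-dimensional
# persistent homology of a clique-complex filtration built from one
# dense layer's trained weights. Requires `pip install numpy ripser`.
import numpy as np
from ripser import ripser

def phom_like_score(weight_matrix: np.ndarray) -> float:
    """Sum of 1-dim persistence lifetimes; a rough co-adaptation summary."""
    n_in, n_out = weight_matrix.shape
    n = n_in + n_out
    # Strong |weight| -> small dissimilarity, so strong connections enter
    # the filtration first; same-side pairs get a value above any edge.
    w = np.abs(weight_matrix)
    w = w / (w.max() + 1e-12)
    dist = np.full((n, n), 2.0)
    dist[:n_in, n_in:] = 1.0 - w          # bipartite edges only
    dist[n_in:, :n_in] = (1.0 - w).T
    np.fill_diagonal(dist, 0.0)
    # Vietoris-Rips on this matrix = clique-complex filtration of the graph.
    dgms = ripser(dist, distance_matrix=True, maxdim=1)["dgms"]
    h1 = dgms[1]                          # (birth, death) pairs in dim 1
    return float(np.sum(h1[:, 1] - h1[:, 0]))

rng = np.random.default_rng(0)
print(phom_like_score(rng.normal(size=(64, 32))))  # random-weight baseline
```

Under this toy transform, tightly coupled groups of neurons create one-dimensional holes that persist across a wide range of the filtration, so a larger total lifetime loosely tracks stronger co-adaptation; PHOM proper is defined over the whole network, and NPHOM additionally normalizes for network structure.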


Footnotes
1
We used the notation from https://towardsdatascience.com/convolutional-neural-networks-mathematics-1beb3e6447c0 with modifications based on our understanding.
 
2
The source code and models used in the evaluation can be accessed at https://github.com/satoru-watanabe-aw/phom/.
 
Metadata
Title
Overfitting measurement of convolutional neural networks using trained network weights
Authors
Satoru Watanabe
Hayato Yamana
Publication date
12.05.2022
Publisher
Springer International Publishing
Published in
International Journal of Data Science and Analytics / Issue 3/2022
Print ISSN: 2364-415X
Electronic ISSN: 2364-4168
DOI
https://doi.org/10.1007/s41060-022-00332-1
