Published in: Neural Processing Letters 2/2019

23.08.2018

Competitive Cross-Entropy Loss: A Study on Training Single-Layer Neural Networks for Solving Nonlinearly Separable Classification Problems

Author: Kamaledin Ghiasi-Shirazi


Abstract

After Minsky and Papert (Perceptrons, MIT Press, Cambridge, 1969) showed that perceptrons cannot solve nonlinearly separable problems, this result was misinterpreted for several decades as an inherent weakness common to all single-layer neural networks. The introduction of the backpropagation algorithm reinforced this misinterpretation, since its success in solving nonlinearly separable problems came through the training of multilayer neural networks. Recently, Conaway and Kurtz (Neural Comput 29(3):861–866, 2017) proposed a single-layer network in which each class has as many output units as there are input units, and showed that it could solve some nonlinearly separable problems. They used the mean squared error (MSE) between the input units and the output units of the actual class as the objective function for training the network, and showed that their method could solve the XOR and M&S'81 problems but performed no better than random guessing on the 3-bit parity problem. In this paper, we use a soft competitive approach to generalize the cross-entropy (CE) loss, a widely accepted criterion for multiclass classification, to networks that have several output units per class, calling the resulting measure the competitive cross-entropy (CCE) loss. In contrast to Conaway and Kurtz (2017), our method allows the number of output units per class to be chosen arbitrarily. We show that the proposed method successfully solves the 3-bit parity problem, in addition to the XOR and M&S'81 problems. Furthermore, we perform experiments on several multiclass classification datasets, comparing a single-layer network trained with the proposed CCE loss against LVQ, linear SVM, a single-layer network trained with the CE loss, and the method of Conaway and Kurtz (2017). The results show that the CCE loss performs remarkably better than existing algorithms for training single-layer neural networks.
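This excerpt does not reproduce the paper's formal definition of the CCE loss, so the following is only a minimal sketch of one plausible reading, not the author's exact algorithm: each class owns K output units, a softmax is taken over all output units, the class probability is the pooled softmax mass of that class's units, and the loss is the negative log of the true class's pooled mass, so that the true class's units compete softly for the gradient signal. The function name cce_loss_and_grad, the choice K = 4, the learning rate, and the epoch count are all illustrative assumptions.

import numpy as np

def cce_loss_and_grad(W, b, x, y, n_classes, K):
    # Scores of all n_classes*K output units of the single-layer network.
    z = W.T @ x + b
    z = z - z.max()                        # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()        # softmax over ALL output units
    unit_class = np.repeat(np.arange(n_classes), K)
    mask = unit_class == y
    P_y = p[mask].sum()                    # class probability = pooled unit mass
    loss = -np.log(P_y)
    # Soft competition: the target mass is shared among the true class's
    # units in proportion to their current responses.
    q = np.zeros_like(p)
    q[mask] = p[mask] / P_y
    dz = p - q                             # gradient of the loss w.r.t. z
    return loss, np.outer(x, dz), dz

# Toy demonstration on the 3-bit parity problem (8 patterns, 2 classes).
X = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)], dtype=float)
Y = X.sum(axis=1).astype(int) % 2
n_classes, K, lr = 2, 4, 0.5               # K and lr are illustrative choices
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(3, n_classes * K))
b = np.zeros(n_classes * K)
for epoch in range(2000):
    for x, y in zip(X, Y):
        _, dW, db = cce_loss_and_grad(W, b, x, y, n_classes, K)
        W -= lr * dW
        b -= lr * db
unit_class = np.repeat(np.arange(n_classes), K)
pred = np.array([unit_class[np.argmax(W.T @ x + b)] for x in X])
print("training accuracy:", (pred == Y).mean())

At test time the prediction is the class of the winning output unit, so each class's score behaves like a maximum over several linear functions and the resulting decision boundary is piecewise linear; this is what allows a single-layer network to represent parity at all. Whether training actually reaches perfect accuracy on the eight patterns depends on the initialization, the number of units per class, and the learning rate.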


Footnotes
2
Note that while the proposed method uses 60 output neurons in this experiment, the method of Conaway and Kurtz [4] requires 7840, since their architecture allocates one output unit per input dimension for each class.
 
References
1.
Bagarello F, Cinà M, Gargano F (2017) Projector operators in clustering. Math Methods Appl Sci 40(1):49–59
2.
Bishop CM (1995) Neural networks for pattern recognition. Oxford University Press, Oxford
3.
Chan TH, Jia K, Gao S, Lu J, Zeng Z, Ma Y (2015) PCANet: a simple deep learning baseline for image classification? IEEE Trans Image Process 24(12):5017–5032
4.
Conaway N, Kurtz KJ (2017) Solving nonlinearly separable classifications in a single-layer neural network. Neural Comput 29(3):861–866
5.
Crammer K, Dekel O, Keshet J, Shalev-Shwartz S, Singer Y (2006) Online passive-aggressive algorithms. J Mach Learn Res 7(Mar):551–585
6.
Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ (2008) LIBLINEAR: a library for large linear classification. J Mach Learn Res 9(Aug):1871–1874
7.
Girshick R, Donahue J, Darrell T, Malik J (2016) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158
8.
Kohonen T (1995) Learning vector quantization. In: Self-organizing maps. Springer, pp 175–189
9.
Kohonen T, Hynninen J, Kangas J, Laaksonen J, Torkkola K (1996) LVQ_PAK: the learning vector quantization program package. Technical report, Laboratory of Computer and Information Science, Rakentajanaukio 2 C, 1991–1992
10.
Martín-del Brío B (1996) A dot product neuron for hardware implementation of competitive networks. IEEE Trans Neural Netw 7(2):529–532
11.
Medin DL, Schwanenflugel PJ (1981) Linear separability in classification learning. J Exp Psychol Hum Learn Mem 7(5):355
12.
Mensink T, Verbeek J, Perronnin F, Csurka G (2013) Distance-based image classification: generalizing to new classes at near-zero cost. IEEE Trans Pattern Anal Mach Intell 35(11):2624–2637
13.
Minsky M, Papert S (1969) Perceptrons. MIT Press, Cambridge
14.
Rosasco L, De Vito E, Caponnetto A, Piana M, Verri A (2004) Are loss functions all the same? Neural Comput 16(5):1063–1076
15.
Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386
16.
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323(6088):533–538
17.
18.
Urcid G, Ritter GX, Iancu L (2004) Single layer morphological perceptron solution to the n-bit parity problem. In: Iberoamerican congress on pattern recognition. Springer, pp 171–178
19.
Zhu G, Lin L, Jiang Y (2017) Resolve XOR problem in a single layer neural network. In: IWACIII 2017, 5th international workshop on advanced computational intelligence and intelligent informatics. Fuji Technology Press Ltd
Metadata
Title
Competitive Cross-Entropy Loss: A Study on Training Single-Layer Neural Networks for Solving Nonlinearly Separable Classification Problems
Author
Kamaledin Ghiasi-Shirazi
Publication date
23.08.2018
Publisher
Springer US
Published in
Neural Processing Letters / Issue 2/2019
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-018-9906-5
