Published in: Neural Processing Letters 1/2015

01.08.2015

On the Design of Robust Linear Pattern Classifiers Based on \(M\)-Estimators

By: Guilherme A. Barreto, Ana Luiza B. P. Barros


Abstract

Classical linear neural network architectures, such as the optimal linear associative memory (OLAM) [Kohonen and Ruohonen, IEEE Trans Comput 22(7):701–702, 1973] and the adaptive linear element (Adaline) [Widrow, IEEE Signal Process Mag 22(1):100–106, 2005; Widrow and Winter, IEEE Comput 21(3):25–39, 1988], are commonly used either as standalone pattern classifiers for linearly separable problems or as fundamental building blocks of multilayer nonlinear classifiers, such as the multilayer perceptron (MLP), the radial basis functions network (RBFN), the extreme learning machine (ELM) [Huang et al., Int J Mach Learn Cybern 2:107–122, 2011] and the echo-state network (ESN) [Emmerich et al., Proceedings of the 20th international conference on artificial neural networks, pp 148–153, 2010]. A common feature shared by the learning rules of the OLAM and the Adaline, respectively the ordinary least squares (OLS) and the least mean squares (LMS) algorithms, is that they are optimal only under the assumption of Gaussian errors. The presence of outliers in the data, however, causes the error distribution to depart from Gaussianity, and the classifier's performance deteriorates accordingly. Bearing this in mind, in this paper we develop simple and efficient extensions of the OLAM and the Adaline, named robust OLAM (ROLAM) and robust Adaline (Radaline), which are robust to labelling errors (a.k.a. label noise), a type of outlier that often occurs in classification tasks. Such outliers usually result from mistakes made while labelling the data points (e.g. the misjudgement of a specialist) or from typing errors during the creation of data files (e.g. striking an incorrect key on a keyboard). To handle these outliers, the ROLAM and the Radaline use \(M\)-estimators instead of the standard OLS/LMS algorithms to compute the weights of the OLAM and Adaline networks. By means of comprehensive computer simulations using synthetic and real-world data sets, we show that the proposed robust linear classifiers consistently outperform their original versions.
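The core idea of the abstract, replacing the OLS solution with an \(M\)-estimate so that gross errors such as mislabelled points receive bounded influence, can be sketched with Huber-weighted iteratively reweighted least squares (IRLS). This is a minimal illustration, not the paper's exact ROLAM algorithm; the function names and the tuning constant `k=1.345` are assumed defaults of ours:

```python
import numpy as np

def huber_weights(r, k=1.345):
    """Huber weight function: w(r) = 1 for |r| <= k, k/|r| otherwise."""
    a = np.abs(r)
    w = np.ones_like(a)
    big = a > k
    w[big] = k / a[big]
    return w

def irls_fit(X, y, n_iter=25, k=1.345, eps=1e-8):
    """Iteratively reweighted least squares (IRLS) with Huber weights.

    X: (n, d) design matrix (first column of ones for the bias term),
    y: (n,) targets, e.g. +/-1 class labels.
    Returns a robust estimate of the weight vector.
    """
    w = np.linalg.lstsq(X, y, rcond=None)[0]         # ordinary LS start
    for _ in range(n_iter):
        r = y - X @ w                                # residuals
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + eps  # MAD scale
        omega = huber_weights(r / s, k)              # per-sample weights
        sw = np.sqrt(omega)
        w = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return w
```

Points with large standardized residuals (the likely outliers) get weights below one, so a single gross error no longer dominates the normal equations the way it does in plain OLS.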


Appendix
Accessible only with authorisation
Footnotes
1
Also known as delta learning rule or the Widrow–Hoff learning rule [34].
 
2
The first component of \(\mathbf {x}_{n}\) is set to 1 in order to include the bias term.
 
3
In other words, at iteration \(n\) or, equivalently, at the presentation of the \(n\)-th input pattern.
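For this online setting (one update per presented pattern), an \(M\)-estimate variant of the LMS rule can be sketched by passing the error through a bounded influence function before the usual Widrow–Hoff update. The names and constants below are illustrative assumptions, not the paper's exact Radaline rule:

```python
import numpy as np

def huber_psi(e, k=1.345):
    """Huber influence function: linear for small errors, clipped beyond k."""
    return np.clip(e, -k, k)

def radaline_step(w, x, d, eta=0.1, k=1.345):
    """One robust LMS-style update at iteration n.

    x: input pattern with x[0] == 1 (bias component),
    d: desired output (e.g. a +/-1 class label),
    eta: learning rate.
    """
    e = d - w @ x                           # prediction error at iteration n
    return w + eta * huber_psi(e, k) * x    # clipped error bounds outlier influence
```

On clean data the clipped update behaves like ordinary LMS once errors fall below the threshold; a grossly mislabelled pattern can shift the weights by at most `eta * k * x` per presentation.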
 
4
The \(H_{\infty }\) criterion has been introduced, initially in the control theory literature, as a means to ensure robust performance in the face of model uncertainties and lack of statistical information on the exogenous signals.
 
7
Spondylolisthesis is the displacement of a vertebra or the vertebral column in relation to the vertebrae below.
 
References
1. Akusok A, Veganzones D, Miche Y, Severin E, Lendasse A (2014) Finding originally mislabels with MD-ELM. In: Proceedings of the 22nd European symposium on artificial neural networks, computational intelligence and machine learning (ESANN'2014), pp 689–694
2. Alpaydin E, Jordan MI (1996) Local linear perceptrons for classification. IEEE Trans Neural Netw 7(3):788–792
3. Anderson J (1972) A simple neural network generating an interactive memory. Math Biosci 14(3–4):197–220
4. Ayad O (2014) Learning under concept drift with SVM. In: Proceedings of the 24th international conference on artificial neural networks (ICANN'2014), vol LNCS 8681, pp 587–594
5. Bolzern P, Colaneri P, De Nicolao G (1999) H\(_\infty \)-robustness of adaptive filters against measurement noise and parameter drift. Automatica 35(9):1509–1520
6. Chan SC, Zhou Y (2010) On the performance analysis of the least mean \({M}\)-estimate and normalized least mean \({M}\)-estimate algorithms with Gaussian inputs and additive Gaussian and contaminated Gaussian noises. J Signal Process Syst 80(1):81–103
7. Chatterjee S, Hadi AS (1986) Influential observations, high leverage points, and outliers in linear regression. Stat Sci 1(3):379–393
8. Cherkassky V, Fassett K, Vassilas N (1991) Linear algebra approach to neural associative memories and noise performance of neural classifiers. IEEE Trans Comput 40(12):1429–1435
9. Dasgupta S, Kalai AT, Monteleoni C (2009) Analysis of perceptron-based active learning. J Mach Learn Res 10:281–299
10. Duda RO, Hart PE, Stork DG (2006) Pattern classification, 2nd edn. Wiley, New York
11. Eichmann G, Kasparis T (1989) Pattern classification using a linear associative memory. Pattern Recogn 22(6):733–740
12. Emmerich C, Reinhart F, Steil J (2010) Recurrence enhances the spatial encoding of static inputs in reservoir networks. In: Proceedings of the 20th international conference on artificial neural networks, vol LNCS 6353, Springer, pp 148–153
13. Fox J (2002) An R and S-PLUS companion to applied regression. Sage Publications, Thousand Oaks
15. Frenay B, Verleysen M (2014) Classification in the presence of label noise: a survey. IEEE Trans Neural Netw Learn Syst 25(5):845–869
16. Freund Y, Schapire RE (1999) Large margin classification using the perceptron algorithm. Mach Learn 37(3):277–296
17. Frieß T-T, Harrison RF (1999) A kernel-based Adaline for function approximation. Intell Data Anal 3(4):307–313
18. Golub GH, van Loan CF (1996) Matrix computations, 3rd edn. Johns Hopkins University Press, Baltimore
19. Hassibi B, Sayed AH, Kailath T (1994) H\(_\infty \) optimality criteria for LMS and backpropagation. In: Cowan JD, Tesauro G, Alspector J (eds) Advances in neural information processing systems 6. Morgan Kaufmann, San Mateo, pp 351–358
20. Hassibi B, Sayed AH, Kailath T (1996) H\(_\infty \) optimality of the LMS algorithm. IEEE Trans Signal Process 44(2):267–280
21. Haykin S (2008) Neural networks and learning machines, 3rd edn. Prentice-Hall, New Jersey
22. Huang G-B, Wang DH, Lan Y (2011) Extreme learning machines: a survey. Int J Mach Learn Cybern 2:107–122
23. Huber PJ (1964) Robust estimation of a location parameter. Ann Math Stat 35(1):73–101
24.
25. Hyvärinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4–5):411–430
26. Kavak A, Yigit H, Ertunc HM (2005) Using Adaline neural network for performance improvement of smart antennas in TDD wireless communications. IEEE Trans Neural Netw 16(6):1616–1625
27. Kim H-C, Ghahramani Z (2008) Outlier robust Gaussian process classification. In: Proceedings of the 2008 joint IAPR international workshop on structural, syntactic, and statistical pattern recognition (SSPR'08), pp 896–905
28. Kohonen T (1989) Self-organization and associative memory. Springer-Verlag, Berlin
29. Kohonen T, Ruohonen M (1973) Representation of associated data by matrix operators. IEEE Trans Comput 22(7):701–702
30. Liu W, Pokharel P, Principe J (2008) The kernel least-mean-square algorithm. IEEE Trans Signal Process 56(2):543–554
31. Nakano K (1972) Associatron: a model of associative memory. IEEE Trans Syst Man Cybern SMC-2(3):380–388
32. Oja E (1992) Principal components, minor components and linear neural networks. Neural Netw 5:927–935
33. Poggio T, Girosi F (1990) Networks for approximation and learning. Proc IEEE 78(9):1481–1497
34. Principe JC, Euliano NR, Lefebvre WC (2000) Neural and adaptive systems: fundamentals through simulations. Wiley, New York
35. Rousseeuw PJ, Leroy AM (1987) Robust regression and outlier detection. Wiley, New York
36. Stevens JP (1984) Outliers and influential data points in regression analysis. Psychol Bull 95(2):334–344
37. Webb A (2002) Statistical pattern recognition, 2nd edn. Wiley, New York
38. Widrow B (2005) Thinking about thinking: the discovery of the LMS algorithm. IEEE Signal Process Mag 22(1):100–106
39. Widrow B, Kamenetsky M (2003) Statistical efficiency of adaptive algorithms. Neural Netw 16(5–6):735–744
40. Widrow B, Winter R (1988) Neural nets for adaptive filtering and adaptive pattern recognition. IEEE Comput 21(3):25–39
41. Williamson GA, Clarkson PM, Sethares WA (1993) Performance characteristics of the median LMS adaptive filter. IEEE Trans Signal Process 41(2):667–680
42. Wu Y, Liu Y (2007) Robust truncated hinge loss support vector machines. J Am Stat Assoc 102(479):974–983
43. Zhu X, Wu X (2004) Class noise versus attribute noise: a quantitative study. Artif Intell Rev 22(3):177–210
44. Zou Y, Chan SC, Ng TS (2000) Least mean \(M\)-estimate algorithms for robust adaptive filtering in impulsive noise. IEEE Trans Circuits Syst II 47(12):1564–1569
Metadata
Title
On the Design of Robust Linear Pattern Classifiers Based on \(M\)-Estimators
Authors
Guilherme A. Barreto
Ana Luiza B. P. Barros
Publication date
01.08.2015
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2015
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-014-9393-2
