Published in: Neural Processing Letters 1/2015

01.08.2015

On the Design of Robust Linear Pattern Classifiers Based on \(M\)-Estimators

By: Guilherme A. Barreto, Ana Luiza B. P. Barros


Abstract

Classical linear neural network architectures, such as the optimal linear associative memory (OLAM) [Kohonen and Ruohonen, IEEE Trans Comput 22(7):701–702, 1973] and the adaptive linear element (Adaline) [Widrow, IEEE Signal Process Mag 22(1):100–106, 2005; Widrow and Winter, IEEE Comput 21(3):25–39, 1988], are commonly used either as standalone pattern classifiers for linearly separable problems or as fundamental building blocks of multilayer nonlinear classifiers, such as the multilayer perceptron (MLP), the radial basis functions network (RBFN), the extreme learning machine (ELM) [Huang et al., Int J Mach Learn Cybern 2:107–122, 2011] and the echo-state network (ESN) [Emmerich et al., Proceedings of the 20th international conference on artificial neural networks, pp 148–153, 2010]. A common feature shared by the learning rules of the OLAM and the Adaline, respectively the ordinary least squares (OLS) and the least mean squares (LMS) algorithms, is that they are optimal only under the assumption of Gaussian errors. The presence of outliers in the data, however, causes the error distribution to depart from Gaussianity, and the classifier's performance deteriorates accordingly. Bearing this in mind, in this paper we develop simple and efficient extensions of the OLAM and the Adaline, named robust OLAM (ROLAM) and robust Adaline (Radaline), which are robust to labelling errors (a.k.a. label noise), a type of outlier that often occurs in classification tasks. Such outliers usually result from mistakes made while labelling the data points (e.g. the misjudgement of a specialist) or from typing errors during the creation of data files (e.g. striking an incorrect key on a keyboard). To handle these outliers, the ROLAM and the Radaline use \(M\)-estimators instead of the standard OLS/LMS algorithms to compute the weights of the OLAM and Adaline networks. By means of comprehensive computer simulations using synthetic and real-world data sets, we show that the proposed robust linear classifiers consistently outperform their original versions.
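The core idea of the abstract, replacing the OLS solution with an \(M\)-estimate so that gross errors such as mislabelled points receive bounded influence, can be sketched with Huber-weighted iteratively reweighted least squares (IRLS). This is a minimal illustration, not the paper's exact ROLAM algorithm; the function names and the tuning constant `k=1.345` are assumed defaults of ours:

```python
import numpy as np

def huber_weights(r, k=1.345):
    """Huber weight function: w(r) = 1 for |r| <= k, k/|r| otherwise."""
    a = np.abs(r)
    w = np.ones_like(a)
    big = a > k
    w[big] = k / a[big]
    return w

def irls_fit(X, y, n_iter=25, k=1.345, eps=1e-8):
    """Iteratively reweighted least squares (IRLS) with Huber weights.

    X: (n, d) design matrix (first column of ones for the bias term),
    y: (n,) targets, e.g. +/-1 class labels.
    Returns a robust estimate of the weight vector.
    """
    w = np.linalg.lstsq(X, y, rcond=None)[0]         # ordinary LS start
    for _ in range(n_iter):
        r = y - X @ w                                # residuals
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + eps  # MAD scale
        omega = huber_weights(r / s, k)              # per-sample weights
        sw = np.sqrt(omega)
        w = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return w
```

Points with large standardized residuals (the likely outliers) get weights below one, so a single gross error no longer dominates the normal equations the way it does in plain OLS.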


Appendix
Accessible only with authorisation
Footnotes
1
Also known as delta learning rule or the Widrow–Hoff learning rule [34].
 
2
The first component of \(\mathbf {x}_{n}\) is set to 1 in order to include the bias term.
 
3
In other words, at iteration \(n\) or, equivalently, at the presentation of the \(n\)-th input pattern.
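For this online setting (one update per presented pattern), an \(M\)-estimate variant of the LMS rule can be sketched by passing the error through a bounded influence function before the usual Widrow–Hoff update. The names and constants below are illustrative assumptions, not the paper's exact Radaline rule:

```python
import numpy as np

def huber_psi(e, k=1.345):
    """Huber influence function: linear for small errors, clipped beyond k."""
    return np.clip(e, -k, k)

def radaline_step(w, x, d, eta=0.1, k=1.345):
    """One robust LMS-style update at iteration n.

    x: input pattern with x[0] == 1 (bias component),
    d: desired output (e.g. a +/-1 class label),
    eta: learning rate.
    """
    e = d - w @ x                           # prediction error at iteration n
    return w + eta * huber_psi(e, k) * x    # clipped error bounds outlier influence
```

On clean data the clipped update behaves like ordinary LMS once errors fall below the threshold; a grossly mislabelled pattern can shift the weights by at most `eta * k * x` per presentation.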
 
4
The \(H_{\infty }\) criterion has been introduced, initially in the control theory literature, as a means to ensure robust performance in the face of model uncertainties and lack of statistical information on the exogenous signals.
 
7
Spondylolisthesis is the displacement of a vertebra or the vertebral column in relation to the vertebrae below.
 
References
1. Akusok A, Veganzones D, Miche Y, Severin E, Lendasse A (2014) Finding originally mislabels with MD-ELM. In: Proceedings of the 22nd European symposium on artificial neural networks, computational intelligence and machine learning (ESANN'2014), pp 689–694
2. Alpaydin E, Jordan MI (1996) Local linear perceptrons for classification. IEEE Trans Neural Netw 7(3):788–792
3. Anderson J (1972) A simple neural network generating an interactive memory. Math Biosci 14(3–4):197–220
4. Ayad O (2014) Learning under concept drift with SVM. In: Proceedings of the 24th international conference on artificial neural networks (ICANN'2014), vol LNCS 8681, pp 587–594
5. Bolzern P, Colaneri P, De Nicolao G (1999) H\(_\infty \)-robustness of adaptive filters against measurement noise and parameter drift. Automatica 35(9):1509–1520
6. Chan SC, Zhou Y (2010) On the performance analysis of the least mean \({M}\)-estimate and normalized least mean \({M}\)-estimate algorithms with Gaussian inputs and additive Gaussian and contaminated Gaussian noises. J Signal Process Syst 80(1):81–103
7. Chatterjee S, Hadi AS (1986) Influential observations, high leverage points, and outliers in linear regression. Stat Sci 1(3):379–393
8. Cherkassky V, Fassett K, Vassilas N (1991) Linear algebra approach to neural associative memories and noise performance of neural classifiers. IEEE Trans Comput 40(12):1429–1435
9. Dasgupta S, Kalai AT, Monteleoni C (2009) Analysis of perceptron-based active learning. J Mach Learn Res 10:281–299
10. Duda RO, Hart PE, Stork DG (2006) Pattern classification, 2nd edn. Wiley, New York
11. Eichmann G, Kasparis T (1989) Pattern classification using a linear associative memory. Pattern Recogn 22(6):733–740
12. Emmerich C, Reinhart F, Steil J (2010) Recurrence enhances the spatial encoding of static inputs in reservoir networks. In: Proceedings of the 20th international conference on artificial neural networks, vol LNCS 6353, Springer, pp 148–153
13. Fox J (2002) An R and S-PLUS companion to applied regression. Sage Publications, Thousand Oaks
15. Frenay B, Verleysen M (2014) Classification in the presence of label noise: a survey. IEEE Trans Neural Netw Learn Syst 25(5):845–869
16. Freund Y, Schapire RE (1999) Large margin classification using the perceptron algorithm. Mach Learn 37(3):277–296
17. Frieß T-T, Harrison RF (1999) A kernel-based Adaline for function approximation. Intell Data Anal 3(4):307–313
18. Golub GH, van Loan CF (1996) Matrix computations, 3rd edn. Johns Hopkins University Press, Baltimore
19. Hassibi B, Sayed AH, Kailath T (1994) H\(_\infty \) optimality criteria for LMS and backpropagation. In: Cowan JD, Tesauro G, Alspector J (eds) Advances in neural information processing systems 6. Morgan Kaufmann, San Mateo, pp 351–358
20. Hassibi B, Sayed AH, Kailath T (1996) H\(_\infty \) optimality of the LMS algorithm. IEEE Trans Signal Process 44(2):267–280
21. Haykin S (2008) Neural networks and learning machines, 3rd edn. Prentice-Hall, New Jersey
22. Huang G-B, Wang DH, Lan Y (2011) Extreme learning machines: a survey. Int J Mach Learn Cybern 2:107–122
23. Huber PJ (1964) Robust estimation of a location parameter. Ann Math Stat 35(1):73–101
24.
25. Hyvärinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4–5):411–430
26. Kavak A, Yigit H, Ertunc HM (2005) Using Adaline neural network for performance improvement of smart antennas in TDD wireless communications. IEEE Trans Neural Netw 16(6):1616–1625
27. Kim H-C, Ghahramani Z (2008) Outlier robust Gaussian process classification. In: Proceedings of the 2008 joint IAPR international workshop on structural, syntactic, and statistical pattern recognition (SSPR'08), pp 896–905
28. Kohonen T (1989) Self-organization and associative memory. Springer-Verlag, Berlin
29. Kohonen T, Ruohonen M (1973) Representation of associated data by matrix operators. IEEE Trans Comput 22(7):701–702
30. Liu W, Pokharel P, Principe J (2008) The kernel least-mean-square algorithm. IEEE Trans Signal Process 56(2):543–554
31. Nakano K (1972) Associatron: a model of associative memory. IEEE Trans Syst Man Cybern SMC-2(3):380–388
32. Oja E (1992) Principal components, minor components and linear neural networks. Neural Netw 5:927–935
33. Poggio T, Girosi F (1990) Networks for approximation and learning. Proc IEEE 78(9):1481–1497
34. Principe JC, Euliano NR, Lefebvre WC (2000) Neural and adaptive systems: fundamentals through simulations. Wiley, New York
35. Rousseeuw PJ, Leroy AM (1987) Robust regression and outlier detection. Wiley, New York
36. Stevens JP (1984) Outliers and influential data points in regression analysis. Psychol Bull 95(2):334–344
37. Webb A (2002) Statistical pattern recognition, 2nd edn. Wiley, New York
38. Widrow B (2005) Thinking about thinking: the discovery of the LMS algorithm. IEEE Signal Process Mag 22(1):100–106
39. Widrow B, Kamenetsky M (2003) Statistical efficiency of adaptive algorithms. Neural Netw 16(5–6):735–744
40. Widrow B, Winter R (1988) Neural nets for adaptive filtering and adaptive pattern recognition. IEEE Comput 21(3):25–39
41. Williamson GA, Clarkson PM, Sethares WA (1993) Performance characteristics of the median LMS adaptive filter. IEEE Trans Signal Process 41(2):667–680
42. Wu Y, Liu Y (2007) Robust truncated hinge loss support vector machines. J Am Stat Assoc 102(479):974–983
43. Zhu X, Wu X (2004) Class noise versus attribute noise: a quantitative study. Artif Intell Rev 22(3):177–210
44. Zou Y, Chan SC, Ng TS (2000) Least mean \(M\)-estimate algorithms for robust adaptive filtering in impulsive noise. IEEE Trans Circuits Syst II 47(12):1564–1569
Metadata
Title
On the Design of Robust Linear Pattern Classifiers Based on \(M\)-Estimators
Authors
Guilherme A. Barreto
Ana Luiza B. P. Barros
Publication date
01.08.2015
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2015
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-014-9393-2
