Published in: International Journal of Machine Learning and Cybernetics 12/2019

05.03.2019 | Original Article

Fast and unsupervised outlier removal by recurrent adaptive reconstruction extreme learning machine

Authors: Wang Siqi, Liu Qiang, Guo Xifeng, Zhu En, Yin Jianping


Abstract

Outlier removal is vital in machine learning. As massive amounts of unlabeled data are generated rapidly today, eliminating outliers from noisy data in a fast and unsupervised manner is gaining increasing attention in practical applications. This paper tackles this challenging problem by proposing a novel Recurrent Adaptive Reconstruction Extreme Learning Machine (RAR-ELM). Specifically, given a noisy data collection, RAR-ELM recurrently learns to reconstruct the data and automatically excludes data with high reconstruction errors as outliers through a novel adaptive labeling mechanism. Compared with existing methods, the proposed RAR-ELM has three major merits. First, RAR-ELM inherits the fast and sound learning property of the original extreme learning machine (ELM): it runs tens or hundreds of times faster while achieving outlier removal performance superior or comparable to existing methods, which makes it particularly suitable for application scenarios such as real-time outlier removal. Second, instead of requiring a decision threshold to be specified in advance, RAR-ELM adaptively finds a reasonable decision threshold when processing data with different proportions of outliers, which is vital for unsupervised outlier removal, where no prior knowledge of the outliers in the data is available. Third, we also propose Online Sequential RAR-ELM (OS-RAR-ELM), which can be run in an online or sequential mode and makes RAR-ELM easily applicable to massive noisy data or online sequential data. Extensive experiments on various datasets show that the proposed RAR-ELM achieves faster and better unsupervised outlier removal than existing methods.
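The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea it describes: repeatedly fit a reconstruction model on the points currently considered inliers, then relabel every point by its reconstruction error. Everything concrete here is an assumption made for illustration and is not taken from the paper: the tanh random-feature autoencoder, the ridge parameter, the Otsu-style two-class cut used as a stand-in for the adaptive labeling mechanism, and the function names `elm_autoencoder_errors`, `otsu_threshold`, and `remove_outliers`.

```python
import numpy as np


def elm_autoencoder_errors(X_train, X_all, n_hidden=16, ridge=1e-2, rng=None):
    """Fit a random-feature (ELM-style) autoencoder on X_train and return
    per-sample squared reconstruction errors for X_all."""
    rng = np.random.default_rng(rng)
    d = X_train.shape[1]
    W = rng.standard_normal((d, n_hidden))  # random, untrained input weights
    b = rng.standard_normal(n_hidden)       # random, untrained biases
    H_train = np.tanh(X_train @ W + b)
    # Output weights in closed form: ridge-regularized least squares.
    A = H_train.T @ H_train + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(A, H_train.T @ X_train)
    X_hat = np.tanh(X_all @ W + b) @ beta
    return np.sum((X_all - X_hat) ** 2, axis=1)


def otsu_threshold(values):
    """Return the cut that maximizes between-class variance (Otsu-style).
    Assumed stand-in for an adaptive threshold; needs at least two values."""
    v = np.sort(values)
    n = v.size
    csum = np.cumsum(v)
    k = np.arange(1, n)                     # size of the low-error group
    mean_low = csum[:-1] / k
    mean_high = (csum[-1] - csum[:-1]) / (n - k)
    between = k * (n - k) * (mean_low - mean_high) ** 2
    i = int(np.argmax(between))
    return 0.5 * (v[i] + v[i + 1])          # midpoint between the two groups


def remove_outliers(X, n_rounds=5, **elm_kwargs):
    """Alternate between fitting on current inliers and relabeling all points.
    Returns a boolean mask that is True for points kept as inliers."""
    inlier = np.ones(X.shape[0], dtype=bool)
    for _ in range(n_rounds):
        errors = elm_autoencoder_errors(X[inlier], X, **elm_kwargs)
        new_inlier = errors <= otsu_threshold(errors)
        if np.array_equal(new_inlier, inlier):
            break                           # labels stopped changing
        inlier = new_inlier
    return inlier
```

A call such as `mask = remove_outliers(X, n_hidden=16, rng=0)` returns the inlier mask for a data matrix `X` with one sample per row. Because the hidden weights are random and never trained, each round costs a single linear solve, which is the property of ELM-style models that the abstract's speed claim rests on. How well this particular sketch separates outliers depends on the data and on the assumed hyperparameters.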

Metadata
Title
Fast and unsupervised outlier removal by recurrent adaptive reconstruction extreme learning machine
Authors
Wang Siqi
Liu Qiang
Guo Xifeng
Zhu En
Yin Jianping
Publication date
05.03.2019
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 12/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-019-00943-4
