Published in: Artificial Intelligence Review 1/2020

3 January 2019

Detecting anomalies in sequential data augmented with new features

Authors: Xiangzeng Kong, Yaxin Bi, David H. Glass


Abstract

This paper presents a new weighted local outlier factor method for anomaly detection, underpinned by three novel components: (1) a piecewise linear representation defined on the basis of important points, which consist of extreme points and additional points; (2) a set of new features used to identify anomalies given the new piecewise linear representation; (3) a weighting scheme that assigns different weights to different features according to their discriminant power. The underlying idea of the proposed method is to characterize a time series with a set of four features and then discover abnormal changes by taking into account the closeness of data points augmented with the new features. Comparative experiments demonstrate that the proposed piecewise representation method performs well on sequential time series data, and that the weighted local outlier factor method achieves better accuracy and RankPower in detecting anomalies on the same data sets than the conventional local outlier factor, normalized local outlier factor and HOT SAX (symbolic aggregate approximation) methods.
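
To make the idea of a feature-weighted local outlier factor more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract names neither the four features nor the weighting formula, so the feature vectors and the `weights` values below are placeholders chosen for illustration. The sketch computes the standard local outlier factor with a weighted Euclidean distance and, as a simplification, takes exactly `k` nearest neighbours per point rather than handling distance ties.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def weighted_lof(points, weights, k):
    """Return one LOF score per point, using a feature-weighted distance.

    Scores near 1 indicate a point whose local density matches that of its
    neighbours; clearly larger scores indicate candidate anomalies.
    """
    n = len(points)
    dist = [[weighted_distance(points[i], points[j], weights) for j in range(n)]
            for i in range(n)]

    neighbors = []  # indices of the k nearest neighbours of each point
    k_dist = []     # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        nearest = order[:k]
        neighbors.append(nearest)
        k_dist.append(dist[i][nearest[-1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        mean_reach = sum(reach) / len(reach)
        lrd.append(1.0 / max(mean_reach, 1e-12))  # guard against duplicates

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]


# Placeholder data: each row stands for a segment described by four features.
segments = [
    (0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.0),
    (0.0, 0.1, 0.0, 0.0),
    (0.1, 0.1, 0.0, 0.0),
    (0.05, 0.05, 0.0, 0.0),
    (5.0, 5.0, 0.0, 0.0),  # isolated segment
]
weights = (1.0, 1.0, 1.0, 1.0)  # hypothetical; equal weights give plain LOF
scores = weighted_lof(segments, weights, k=2)
most_anomalous = max(range(len(scores)), key=lambda i: scores[i])
print(most_anomalous, scores)
```

With equal weights the sketch reduces to the conventional local outlier factor. Raising the weight of one feature makes differences in that feature count more heavily in the neighbourhood structure, which is the general effect a discriminant-power weighting aims for.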


Metadata
Title
Detecting anomalies in sequential data augmented with new features
Authors
Xiangzeng Kong
Yaxin Bi
David H. Glass
Publication date
3 January 2019
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 1/2020
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-018-9671-x
