Published in: Artificial Intelligence Review 1/2020

03-01-2019

Detecting anomalies in sequential data augmented with new features

Authors: Xiangzeng Kong, Yaxin Bi, David H. Glass

Abstract

This paper presents a new weighted local outlier factor method for anomaly detection, which is underpinned by three novel components: (1) a piecewise linear representation defined on the basis of important points, which consist of extreme points and additional points; (2) a set of new features used to identify anomalies given the new piecewise linear representation; (3) a weighting scheme that assigns different weights to different features according to their discriminant power. The underlying idea of the proposed method is to characterize a time series with a set of four features and then discover abnormal changes by taking account of the closeness of data points augmented with the new features. The comparative experiments demonstrate that the proposed piecewise representation method performs well on sequential time series data, and that the weighted local outlier factor method achieves better accuracy and RankPower in detecting anomalies on the same data sets than the conventional local outlier factor, normalized local outlier factor and HOT SAX (symbolic aggregate approximation) methods.
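
The general idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the "additional points", the four features, or how the weights are derived, so `extreme_points` keeps only strict local extrema plus the endpoints, and `weighted_lof` is the standard local outlier factor computed under a feature-weighted Euclidean distance with weights supplied by the caller. All function names are hypothetical.

```python
import math


def extreme_points(series):
    """Indices of strict local minima/maxima plus both endpoints.

    Assumption: a plain local-extremum test stands in for the paper's
    'important points', whose exact definition is not in the abstract.
    """
    idx = [0]
    for i in range(1, len(series) - 1):
        left, mid, right = series[i - 1], series[i], series[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            idx.append(i)
    idx.append(len(series) - 1)
    return idx


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def weighted_lof(points, weights, k):
    """Local outlier factor under a weighted distance.

    Simplifications (assumptions of this sketch): each point uses exactly
    k neighbours, ties are broken by index, and no point has k or more
    exact duplicates, so every reachability sum is positive.
    """
    n = len(points)
    dist = [[weighted_distance(points[i], points[j], weights)
             for j in range(n)] for i in range(n)]

    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])  # distance to the k-th neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neighbours[i])
        lrd.append(k / reach)

    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i])
            for i in range(n)]


if __name__ == "__main__":
    series = [0, 1, 2, 1, 0, 1, 2, 3, 2]
    print(extreme_points(series))

    # Hypothetical two-feature points; the last one lies far from the rest.
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(weighted_lof(points, weights=(1.0, 1.0), k=2))
```

Scores close to 1 indicate a point whose local density matches that of its neighbours, while clearly larger scores flag candidate anomalies. Raising the weight of a feature makes differences in that feature count more in the neighbourhood search, which is the role the abstract assigns to the weighting scheme.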

Metadata
Title
Detecting anomalies in sequential data augmented with new features
Authors
Xiangzeng Kong
Yaxin Bi
David H. Glass
Publication date
03-01-2019
Publisher
Springer Netherlands
Published in
Artificial Intelligence Review / Issue 1/2020
Print ISSN: 0269-2821
Electronic ISSN: 1573-7462
DOI
https://doi.org/10.1007/s10462-018-9671-x
