survey

Anomaly Detection Methods for Categorical Data: A Review

Published: 30 May 2019

Abstract

Anomaly detection has numerous applications in diverse fields. For example, it has been widely used for discovering network intrusions and malicious events. It has also been used in numerous other applications, such as identifying medical malpractice or credit fraud. Detection of anomalies in quantitative data has received considerable attention in the literature and has a venerable history. By contrast, and despite the widespread availability and use of categorical data in practice, anomaly detection in categorical data has received relatively little attention. This is because detection of anomalies in categorical data is a challenging problem. Some anomaly detection techniques depend on identifying a representative pattern and then measuring distances between objects and this pattern. Objects that are far from this pattern are declared anomalies. However, identifying patterns and measuring distances are harder in categorical data than in quantitative data. Fortunately, several papers focusing on the detection of anomalies in categorical data have been published in the recent literature. In this article, we provide a comprehensive review of the research on the anomaly detection problem in categorical data. Previous review articles focus on either the statistics literature or the machine learning and computer science literature. This review article combines both literatures. We review 36 methods for the detection of anomalies in categorical data in both literatures and classify them into 12 different categories based on the conceptual definition of anomalies they use. For each approach, we survey anomaly detection methods, and then show the similarities and differences among them. We emphasize two important issues: the number of parameters each method requires and its time complexity. The first issue is critical because the performance of these methods is sensitive to the choice of these parameters. Time complexity is also very important in real applications, especially in big data applications. We report the time complexity if it is reported by the authors of the methods. If it is not, then we derive it ourselves and report it in this article. In addition, we discuss the common problems and future directions of anomaly detection in categorical data.
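
As a rough illustration of the pattern-and-distance idea mentioned in the abstract, the sketch below takes the per-attribute mode as the representative pattern and uses the Hamming distance (the number of mismatching attributes) to score each object. This is not one of the 36 reviewed methods. The function names, the toy records, and the `threshold` parameter are assumptions made for this example only.

```python
from collections import Counter


def mode_pattern(records):
    """Return the most frequent value of each attribute (hypothetical 'pattern')."""
    columns = zip(*records)
    return tuple(Counter(col).most_common(1)[0][0] for col in columns)


def hamming(a, b):
    """Count the attributes on which two categorical records disagree."""
    return sum(x != y for x, y in zip(a, b))


def flag_anomalies(records, threshold):
    """Flag records whose distance to the mode pattern is at least `threshold`.

    `threshold` is an assumed user-chosen parameter, not a value from the article.
    """
    pattern = mode_pattern(records)
    return [r for r in records if hamming(r, pattern) >= threshold]


# Toy categorical data (invented for illustration): protocol, service, flag.
records = [
    ("tcp", "http", "SF"),
    ("tcp", "http", "SF"),
    ("tcp", "smtp", "SF"),
    ("tcp", "http", "SF"),
    ("udp", "dns", "REJ"),
]

print(mode_pattern(records))
print(flag_anomalies(records, threshold=2))
```

With the toy data, only the last record differs from the mode pattern on at least two attributes. Changing `threshold` from 2 to 1 would also flag the `smtp` record, a small instance of the parameter sensitivity the abstract emphasizes. Computing the pattern and the distances takes time proportional to the number of records times the number of attributes.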

  137. C. Phua, D. Alahakoon, and V. Lee. 2004. Minority report in fraud detection: Classification of skewed data. ACM SIGKDD Explor. Newslett. 6, 1 (2004), 50--59. Google ScholarGoogle ScholarDigital LibraryDigital Library
  138. Clifton Phua, Vincent C. S. Lee, Kate Smith-Miles, and Ross W. Gayler. 2010. A comprehensive survey of data mining-based fraud detection research. Retrieved from http://arxiv.org/abs/1009.6119.Google ScholarGoogle Scholar
  139. Marco AF Pimentel, David A Clifton, Lei Clifton, and Lionel Tarassenko. 2014. A review of novelty detection. Signal Process. 99 (2014), 215--249. Google ScholarGoogle ScholarDigital LibraryDigital Library
  140. Srijoni Saha Pradip, Jesica Fernandes Robert, and Jasmine Faujdar Hamza. 2015. Information-theoretic outlier detection for large-scale categorical data. Int. J. Comput. Sci. Mobile Comput. 4, 4 (2015), 873--881.Google ScholarGoogle Scholar
  141. Raghav M. Purankar and Pragati Patil. 2015. A survey paper on an effective analytical approaches for detecting outlier in continuous time variant data stream. Int. J. Eng. Comput. Sci. 4, 11 (2015), 14946--14949.Google ScholarGoogle Scholar
  142. Sridhar Ramaswamy, Rajeev Rastogi, and Kyuseok Shim. 2000. Efficient algorithms for mining outliers from large data sets. In Proceedings of the ACM International Conference on Management of Data (SIGMOD’00). 427--438. Google ScholarGoogle ScholarDigital LibraryDigital Library
  143. Stephen Ranshous, Shitian Shen, Danai Koutra, Steve Harenberg, Christos Faloutsos, and Nagiza F. Samatova. 2015. Anomaly detection in dynamic networks: A survey. Wiley Interdisc. Rev.: Comput. Stat. 7, 3 (2015), 223--247. Google ScholarGoogle ScholarDigital LibraryDigital Library
  144. Lida Rashidi, Sattar Hashemi, and Ali Hamzeh. 2011. Anomaly detection in categorical datasets using Bayesian networks. In Proceedings of the 3rd International Conference on Artificial Intelligence and Computational Intelligence, Part II (AICI’11). 610--619. Google ScholarGoogle ScholarDigital LibraryDigital Library
  145. Murad A. Rassam, M. A. Maarof, and Anazida Zainal. 2012. A survey of intrusion detection schemes in wireless sensor networks. Amer. J. Appl. Sci. 9, 10 (2012), 1636--1652.Google ScholarGoogle ScholarCross RefCross Ref
  146. Murad A. Rassam, Anazida Zainal, and Mohd Aizaini Maarof. 2013. Advancements of data anomaly detection research in wireless sensor networks: A survey and open issues. Sensors 13, 8 (2013), 10087--10122.Google ScholarGoogle ScholarCross RefCross Ref
  147. D. Lakshmi Sreenivasa Reddy, B. Raveendra Babu, and A. Govardhan. 2013. Outlier analysis of categorical data using navf. Informat. Econom. 17, 1 (2013), 1--5.Google ScholarGoogle Scholar
  148. Abdolazim Rezaei, Zarinah M. Kasirun, Vala Ali Rohani, and Touraj Khodadadi. 2013. Anomaly detection in online social networks using structure-based technique. In Proceedings of the International Conference for Internet Technology and Secured Transactions (ICITST’13). 619--622.Google ScholarGoogle Scholar
  149. Ritika, Tarun Kumar, and Amandeep Kaur. 2013. Outlier detection in WSN: A survey. Int. J. Adv. Res. Comput. Sci. Softw. Eng. 3, 7 (2013), 609--617.Google ScholarGoogle Scholar
  150. N. Rokhman, Subanar, and E. Winarko. 2016. Improving the performance of outlier detection methods for Categorical data by using weighting function. J. Theor. Appl.d Info.n Technol. 83 (2016), 327--336.Google ScholarGoogle Scholar
  151. Peter J. Rousseeuw and Katrien Van Driessen. 1998. A fast algorithm for the minimum covariance determinant estimator. Technometrics 41 (1998), 212--223. Google ScholarGoogle ScholarDigital LibraryDigital Library
  152. Ashwini G. Sagade and Ritesh Thakur. 2014. Excess entropy based outlier detection in categorical data set. Int. J. Adv. Comput. Eng. Netw. 2, 8 (2014), 56--61.Google ScholarGoogle Scholar
  153. Aiman Moyaid Said, Dhanapal Durai Dominic, and Brahim Belhaouari Samir. 2013. Outlier detection scoring measurements based on frequent pattern technique. Res. J. Appl. Sci. Eng. Technol. 6, 8 (2013), 1340--134.Google ScholarGoogle ScholarCross RefCross Ref
  154. Arif Sari. 2015. A review of anomaly detection systems in cloud networks and survey of cloud security measures in cloud storage applications. J. Info. Secur. 6, 2 (2015), 142--154.Google ScholarGoogle ScholarCross RefCross Ref
  155. Debajit Sen Sarma and Samar Sen Sarma. 2015. A survey on different graph based anomaly detection techniques. Indian J. Sci. Technol. 8, 31 (2015), 1--7.Google ScholarGoogle ScholarCross RefCross Ref
  156. David Savage, Xiuzhen Zhang, Xinghuo Yu, Pauline Chou, and Qingmai Wang. 2014. Anomaly detection in online social networks. Soc. Netw. 39 (2014), 62--70.Google ScholarGoogle ScholarCross RefCross Ref
  157. Bernhard Schölkopf, John C. Platt, John Shawe-Taylor, Alex J. Smola, and Robert C Williamson. 2001. Estimating the support of a high-dimensional distribution. Neural Comput. 13, 7 (2001), 1443--1471. Google ScholarGoogle ScholarDigital LibraryDigital Library
  158. Junhee Seok and Yeong Seon Kang. 2015. Mutual information between discrete variables with many categories using recursive adaptive partitioning. Sci. Rep. 5 (2015), 1--10.Google ScholarGoogle Scholar
  159. Nauman Shahid, Ijaz Haider Naqvi, and Saad Bin Qaisar. 2015. Characteristics and classification of outlier detection techniques for wireless sensor networks in harsh environments: A survey. Artific. Intell. Rev. 43, 2 (2015), 193--228. Google ScholarGoogle ScholarDigital LibraryDigital Library
  160. Claude Elwood Shannon. 1948. A mathematical theory of communication. Bell Tele. Syst. Techn. Publ. 27, 3 (1948), 379--423.Google ScholarGoogle ScholarCross RefCross Ref
  161. Deep Shikha Shukla, Avinash Chandra Pandey, and Ankur Kulhari. 2014. Outlier detection: A survey on techniques of WSNs involving event and error based outliers. In Proceedings of the International Conference of Innovative Applications of Computational Intelligence on Power, Energy and Controls with their Impact on Humanity (CIPECH’14). 113--116.Google ScholarGoogle ScholarCross RefCross Ref
  162. M. Shyu, K. Sarinnapakorn, I. Kuruppu-Appuhamilage, S. Chen, L. W. Chang, and T. Goldring. 2005. Handling nominal features in anomaly intrusion detection problems. In Proceedings of the International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications. 55--62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  163. Karanjit Singh and Shuchita Upadhyaya. 2012. Outlier detection: Applications and techniques. Int. J. Comput. Sci. Iss. 9, 1 (2012), 307--323.Google ScholarGoogle Scholar
  164. Koen Smets and Jilles Vreeken. 2011. The odd one out: Identifying and characterising anomalies. In Proceedings of the SIAM International Conference on Data Mining (SDM’11). 804--815.Google ScholarGoogle ScholarCross RefCross Ref
  165. Angela A. Sodemann, Matthew P. Ross, and Brett J. Borghetti. 2012. A review of anomaly detection in automated surveillance. IEEE Trans. Syst. Man Cybernet., Part C: Appl. Rev. 42, 6 (2012), 1257--1272. Google ScholarGoogle ScholarDigital LibraryDigital Library
  166. Garule Supriya and Sharmila M. Shinde. 2015. Outliers detection using subspace method: A survey. Int. J. Comput. Appl. 112, 16 (2015), 20--22.Google ScholarGoogle Scholar
  167. N. N. R. R. Suri, M. N. Murty, and G. Athithan. 2012. An algorithm for mining outliers in categorical data through ranking. In Proceedings of the 12th IEEE International Conference on Hybrid Intelligent Systems (HIS’12). 247--252.Google ScholarGoogle Scholar
  168. N. N. R. R. Suri, M. N. Murty, and G. Athithan. 2013. A rough clustering algorithm for mining outliers in categorical data. In Proceedings of the 4th International Conference on Pattern Recognition and Machine Intelligence (PReMI’13). 170--175.Google ScholarGoogle Scholar
  169. N. N. R. R. Suri, M. N. Murty, and G. Athithan. 2014. A ranking-based algorithm for detection of outliers in categorical data. Int. J. Hybrid Intell. Syst. 11 (2014), 1--11. Google ScholarGoogle ScholarDigital LibraryDigital Library
  170. N. N. R. R. Suri, M. N. Murty, and G. Athithan. 2016. Detecting outliers in categorical data through rough clustering. Nat. Comput. 15 (2016), 385--394. Google ScholarGoogle ScholarDigital LibraryDigital Library
  171. Ayman Taha and Ali S. Hadi. 2013. A general approach for automating outliers identification in categorical data. In Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications (AICCSA’13). 1--8.Google ScholarGoogle Scholar
  172. Ayman Taha and Ali S. Hadi. 2016. Pair-wise association for categorical and mixed attributes. Info. Sci. 346 (2016), 73--89. Google ScholarGoogle ScholarDigital LibraryDigital Library
  173. Ayman Taha and Osman Hegazy. 2010. A proposed outliers identification algorithm for categorical data sets. In Proceedings of International Conference on Informatics and Systems (INFOS’10). 1--5.Google ScholarGoogle Scholar
  174. Yun Wang. 2008. Statistical Techniques for Network Security: Modern Statistically-Based Intrusion Detection and Protection. IGI Global, New York, NY. Google ScholarGoogle ScholarDigital LibraryDigital Library
  175. Yibo Wang and Wei Xu. 2018. Leveraging deep learning with LDA-based text analytics to detect automobile insurance fraud. Decis. Support Syst. 105 (2018), 87--95.Google ScholarGoogle ScholarCross RefCross Ref
  176. Li Wei, Weining Qian, Aoying Zhou, Wen Jin, and Jeffrey X. Yu. 2003. Hypergraph-based outlier test for categorical data. In Proceedings of the ACM International Conference on Knowledge Discovery and data Mining (SIGKDD’03). 399--410. Google ScholarGoogle ScholarDigital LibraryDigital Library
  177. David J. Weller-Fahy, Brett J. Borghetti, and Angela A. Sodemann. 2015. A survey of distance and similarity measures used within network intrusion anomaly detection. IEEE Commun. Surveys Tutor. 17, 1 (2015), 70--91.Google ScholarGoogle ScholarDigital LibraryDigital Library
  178. Jarrod West and Maumita Bhattacharya. 2016. Intelligent financial fraud detection: A comprehensive review. Comput. Secur. 57 (2016), 47--66. Google ScholarGoogle ScholarDigital LibraryDigital Library
  179. Shu Wu and Shengrui Wang. 2011. Parameter-free anomaly detection for categorical data. Machine Learning and Data Mining in Pattern Recognition. Lecture Notes in Computer Science 6871 (2011), 112--126. Google ScholarGoogle ScholarDigital LibraryDigital Library
  180. Shu Wu and Shengrui Wang. 2013. Information-theoretic outlier detection for large-scale categorical data. IEEE Trans. Knowl. Data Eng. 25, 3 (2013), 589--602. Google ScholarGoogle ScholarDigital LibraryDigital Library
  181. Warusia Yassin, Nur Izura Udzir, Zaiton Muda, and Nasir Sulaiman. 2013. Anomaly-based intrusion detection through k-means clustering and naives Bayes classification. In Proceedings of the International Conference on Computing and Informatics (ICOCI’13). 298--303.Google ScholarGoogle Scholar
  182. Jeffrey Xu Yu, Weining Qian, Hongjun Lu, and Aoying Zhou. 2006. Finding centric local outliers in categorical/numerical spaces. Knowl. Info. Syst. 9 (2006), 309--338.Google ScholarGoogle ScholarDigital LibraryDigital Library
  183. Rose Yu, Huida Qiu, Zhen Wen, Ching-Yung Lin, and Yan Liu. 2016. A survey on social media anomaly detection. Retrieevd from http://arxiv.org/pdf/1601.01102.Google ScholarGoogle Scholar
  184. Ji Zhang. 2013. Advancements of outlier detection: A survey. ICST Trans. Scal. Info. Syst. 13, 1 (2013), 1--26.Google ScholarGoogle Scholar
  185. Yang Zhang, Nirvana Meratnia, and Paul Havinga. 2010. Outlier detection techniques for wireless sensor networks: A survey. IEEE Commun. Surveys Tutor. 12, 2 (2010), 159--170.Google ScholarGoogle ScholarDigital LibraryDigital Library
  186. Xingwang Zhao, Jiye Liang, and Fuyuan Cao. 2014. A simple and effective outlier detection algorithm for categorical data. Int. J. Mach. Learn. Cybernet. 5 (2014), 469--477.Google ScholarGoogle ScholarCross RefCross Ref
  187. Wobbe P. Zijlstra, L. Andries van der Ark, and Klaas Sijtsma. 2011. Outliers in questionnaire data: Can they be detected and should they be removed. J. Edu. Behav. Stat. 36 (2011), 186--212.Google ScholarGoogle ScholarCross RefCross Ref
  188. Arthur Zimek, Erich Schubert, and Hans-Peter Kriegel. 2012. A survey on unsupervised outlier detection in high-dimensional numerical data. Stat. Anal. Data Min. 5, 5 (2012), 363--387. Google ScholarGoogle ScholarDigital LibraryDigital Library
