Abstract
We demonstrate VSOutlier, our system for interactive exploration of outliers in big data streams. VSOutlier supports a rich variety of outlier types, backed by innovative and efficient outlier detection strategies, and provides a rich set of interactive interfaces for exploring outliers in real time. Using the stock transactions dataset from the US stock market and the moving objects dataset from MITRE, we demonstrate that VSOutlier enables analysts to identify, understand, and respond to phenomena of interest more efficiently and in near real time, even when applied to high-volume streams.
Index Terms
- Interactive outlier exploration in big data streams