skip to main content
10.1145/872757.872764acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Distributed top-k monitoring

Published:09 June 2003Publication History

ABSTRACT

The querying and analysis of data streams has been a topic of much recent interest, motivated by applications from the fields of networking, web usage analysis, sensor instrumentation, telecommunications, and others. Many of these applications involve monitoring answers to continuous queries over data streams produced at physically distributed locations, and most previous approaches require streams to be transmitted to a single location for centralized processing. Unfortunately, the continual transmission of a large number of rapid data streams to a central location can be impractical or expensive. We study a useful class of queries that continuously report the k largest values obtained from distributed data streams ("top-k monitoring queries"), which are of particular interest because they can be used to reduce the overhead incurred while running other types of monitoring queries. We show that transmitting entire data streams is unnecessary to support these queries and present an alternative approach that reduces communication significantly. In our approach, arithmetic constraints are maintained at remote stream sources to ensure that the most recently provided top-k answer remains valid to within a user-specified error tolerance. Distributed communication is only necessary on occasion, when constraints are violated, and we show empirically through extensive simulation on real-world data that our approach reduces overall communication cost by an order of magnitude compared with alternatives that o er the same error guarantees.

References

  1. M. Arlitt and T. Jin. 1998 world cup web site access logs, Aug. 1988. Available at http://www.acm.org/sigcomm/ITA/.Google ScholarGoogle Scholar
  2. M. Arlitt and T. Jin. Workload characterization of the 1998 world cup web site. Technical Report HPL-1999-35R1, Hewlett Packard, Sept. 1999.Google ScholarGoogle Scholar
  3. B. Babcock, S. Babu, M. Datar, R. Motwani, and J. Widom. Models and issues in data stream systems. In Proc. PODS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. B. Babcock and C. Olston. Distributed top-k monitoring. Technical report, Stanford University Computer Science Department, 2002. http://dbpubs.stanford.edu/pub/2002-61.Google ScholarGoogle Scholar
  5. D. Barbara and H. Garcia-Molina. The Demarcation Protocol: A technique for maintaining linear arithmetic constraints in distributed database systems. In Proc. EDBT, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. P. A. Bernstein, B. T. Blaustein, and E. M. Clarke. Fast maintenance of semantic integrity assertions using redundant aggregate data. In Proc. VLDB, 1980.Google ScholarGoogle Scholar
  7. N. Bruno, L. Gravano, and A. Marian. Evaluating top-k queries over web-accessible databases. In Proc. ICDE, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. D. Carney, U. Cetintemel, M. Cherniack, C. Convey, S. Lee, G. Seidman, M. Stonebraker, N. Tatbul, and S. Zdonik. Monitoring streams - a new class of data management applications. In Proc. VLDB, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. Charikar, K. Chen, and M. Farach-Colton. Finding frequent items in data streams. In Proc. Twenty-Ninth International Colloquium on Automata Languages and Programming, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. J. Chen, D. J. DeWitt, F. Tian, and Y. Wang. NiagaraCQ: A scalable continuous query system for internet databases. In Proc. ACM SIGMOD, 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Denial of service attacks using nameservers. Incident Note IN-2000-04, CMU Software Engineering Institute CERT Coordination Center, Apr. 2000. http://www.cert.org/incident_notes/IN-2000-04.html.Google ScholarGoogle Scholar
  12. M. Dilman and D. Raz. Efficient reactive monitoring. In Proc. INFOCOM, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  13. C. Dwork, S. R. Kumar, M. Naor, and D. Sivakumar. Rank aggregation methods for the web. In Proc. WWW10, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. D. Estrin, L. Girod, G. Pottie, and M. Srivastava. Instrumenting the world with wireless sensor networks. In Proc. International Conference on Acoustics, Speech, and Signal Processing, 2001.Google ScholarGoogle ScholarCross RefCross Ref
  15. R. Fagin, S. R. Kumar, and D. Sivakumar. Comparing top k lists. In Proc. SODA, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. R. Fagin, A. Lotem, and M. Naor. Optimal aggregation algorithms for middleware. In Proc. PODS, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. S. Floyd and V. Paxson. Wide-area Traffic: The Failure of Poisson Modeling. IEEE/ACM Transactions on Networking, 3(3), 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. P. B. Gibbons and Y. Matias. New sampling-based summary statistics for improving approximate query answers. In Proc. ACM SIGMOD, 1998. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. P. B. Gibbons and S. Tirthapura. Estimating simple functions on the union of data streams. In Proc. Thirteenth ACM Symposium on Parallel Algorithms and Architectures, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. P. B. Gibbons and S. Tirthapura. Distributed streams algorithms for sliding windows. In Proc. Fourteenth ACM Symposium on Parallel Algorithms and Architectures, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. A. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. Strauss. Surfing wavelets on streams: One-pass summaries for approximate aggregate queries. In Proc. VLDB, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. A. Gupta and J. Widom. Local verification of global integrity constraints in distributed databases. In Proc. ACM SIGMOD, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Householder, A. Manion, L. Pesante, and G. Weaver. Managing the threat of denial-of-service attacks. Technical report, CMU Software Engineering Institute CERT Coordination Center, Oct. 2001. http://www.cert.org/archive/pdf/Managing_DoS.pdf.Google ScholarGoogle Scholar
  24. N. Huyn. Maintaining global integrity constraints in distributed databases. Constraints, 2(3/4), 1997. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. D. Li, K. Wong, Y. H. Hu, and A. Sayeed. Detection, classification and tracking of targets. IEEE Signal Processing Magazine, 2002.Google ScholarGoogle Scholar
  26. S. Madden, M. Shah, J. M. Hellerstein, and V. Raman. Continuously adaptive continuous queries over streams. In Proc. ACM SIGMOD, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. G. S. Manku and R. Motwani. Approximate frequency counts over data streams. In Proc. VLDB, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. R. Min, M. Bhardwaj, S. Cho, A. Sinha, E. Shih, A. Wang, and A. Chandrakasan. Low-power wireless sensor networks. In Proc. Fourteenth International Conference on VLSI Design, 2001. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. R. Motwani, J. Widom, A. Arasu, B. Babcock, S. Babu, M. Datar, G. Manku, C. Olston, J. Rosenstein, and R. Varma. Query processing, resource management, and approximation in a data stream management system. In Proc. CIDR, 2003.Google ScholarGoogle Scholar
  30. C. Olston, J. Jiang, and J. Widom. Adaptive filters for continuous queries over distributed data streams. In Proc. ACM SIGMOD, 2003. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. T. Palpanas, R. Sidle, R. Cochrane, and H. Pirahesh. Incremental maintenance for non-distributive aggregate functions. In Proc. VLDB, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. G. Pottie and W. Kaiser. Wireless integrated network sensors. Communications of the ACM, 43(5), 2000. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. N. Soparkar and A. Silberschatz. Data-value partitioning and virtual messages. In Proc. PODS, 1990. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. T. Yamashita. Dynamic replica control based on fairly assigned variation of data with weak consistency for loosely coupled distributed systems. In Proc. ICDCS, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. K. Yi, H. Yu, J. Yang, G. Xia, and Y. Chen. Efficient maintenance of materialized top-k views. In Proc. ICDE, 2003.Google ScholarGoogle Scholar

Index Terms

  1. Distributed top-k monitoring

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in
          • Published in

            cover image ACM Conferences
            SIGMOD '03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data
            June 2003
            702 pages
            ISBN:158113634X
            DOI:10.1145/872757

            Copyright © 2003 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 9 June 2003

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • Article

            Acceptance Rates

            SIGMOD '03 Paper Acceptance Rate53of342submissions,15%Overall Acceptance Rate785of4,003submissions,20%

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader