DOI: 10.1145/1143844.1143907
Article

Data association for topic intensity tracking

Published: 25 June 2006

ABSTRACT

We present a unified model of what was traditionally viewed as two separate tasks: data association and intensity tracking of multiple topics over time. In the data association part, the task is to assign a topic (a class) to each data point, and the intensity tracking part models the bursts and changes in intensities of topics over time. Our approach to this problem combines an extension of Factorial Hidden Markov models for topic intensity tracking with exponential order statistics for implicit data association. Experiments on text and email datasets show that the interplay of classification and topic intensity tracking improves the accuracy of both classification and intensity tracking. Even a little noise in topic assignments can mislead the traditional algorithms. However, our approach detects correct topic intensities even with 30% topic noise.
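
The abstract names exponential order statistics without spelling out the model, so the following is only a minimal sketch of the standard property that term usually refers to, not the authors' implementation. It assumes each topic emits documents as an independent Poisson process with a fixed intensity (rate). Under that assumption, the gap until the next document in the merged stream is exponential with the summed rate, and the chance that the next document belongs to topic k is that topic's share of the total rate. The function names and the example rates are illustrative and do not come from the paper.

```python
import random


def first_arrival_probs(rates):
    """Probability that each topic produces the next document.

    Assumes independent exponential waiting times, one per topic,
    with the given rates (an assumption of this sketch).
    """
    total = sum(rates)
    return [r / total for r in rates]


def simulate_first_arrival(rates, trials, seed=0):
    """Monte Carlo check of the closed-form property above.

    Returns (empirical share of first arrivals per topic,
             mean gap until the first arrival).
    """
    rng = random.Random(seed)
    wins = [0] * len(rates)
    total_gap = 0.0
    for _ in range(trials):
        # One candidate waiting time per topic; the smallest one "wins".
        times = [rng.expovariate(r) for r in rates]
        gap = min(times)
        wins[times.index(gap)] += 1
        total_gap += gap
    return [w / trials for w in wins], total_gap / trials


if __name__ == "__main__":
    rates = [1.0, 3.0]  # hypothetical intensities for two topics
    print(first_arrival_probs(rates))
    print(simulate_first_arrival(rates, trials=20000))
```

In this toy setting the topic label of each document is never observed directly in the simulation step; it is implied by which waiting time is smallest, which is the sense in which an association can be implicit.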

Published in

ICML '06: Proceedings of the 23rd international conference on Machine learning
June 2006
1154 pages
ISBN: 1595933832
DOI: 10.1145/1143844

Copyright © 2006 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

ICML '06 paper acceptance rate: 140 of 548 submissions, 26%
Overall acceptance rate: 140 of 548 submissions, 26%
