DOI: 10.1145/1814245.1814249
research-article

Emerging topic detection on Twitter based on temporal and social terms evaluation

Published: 25 July 2010

ABSTRACT

Twitter is a user-generated content system that allows its users to share short text messages, called tweets, for a variety of purposes, including daily conversations, URL sharing, and news. Given its worldwide network of users of every age and social condition, it acts as a low-level news-flash portal whose principal advantage is its impressively short response time.

In this paper we recognize this primary role of Twitter and propose a novel topic detection technique that retrieves, in real time, the most emergent topics expressed by the community. First, we extract the contents (sets of terms) of the tweets and model the life cycle of each term according to a novel aging theory intended to mine the emerging ones. A term is defined as emerging if it occurs frequently in the specified time interval and was relatively rare in the past. Moreover, since the importance of a piece of content also depends on its source, we analyze the social relationships in the network with the well-known PageRank algorithm to determine the authority of the users. Finally, we leverage a navigable topic graph that connects the emerging terms with other semantically related keywords, allowing the detection of emerging topics under user-specified time constraints. We provide several case studies that show the validity of the proposed approach.
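The abstract does not give the formulas of the aging theory, so the following is only a minimal sketch of the two ideas it names: user authority computed with PageRank over the follower graph, and a term counted as emerging when it is prominent in the current interval but was rare in earlier ones. The function names, the whitespace tokenization, the plain ratio test standing in for the aging model, and the `ratio` threshold are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def pagerank(followers, damping=0.85, iterations=50):
    """Power-iteration PageRank over a follower graph.

    `followers` maps each user to the list of users they follow, so an
    edge u -> v ("u follows v") passes authority from u to v.
    """
    users = set(followers)
    for followed in followers.values():
        users.update(followed)
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Rank held by users who follow nobody is spread evenly over everyone.
        dangling = sum(rank[u] for u in users if not followers.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in users}
        for u, followed in followers.items():
            if followed:
                share = damping * rank[u] / len(followed)
                for v in followed:
                    new_rank[v] += share
        rank = new_rank
    return rank


def term_scores(tweets, authority):
    """Sum, per term, the authority of every user who tweeted it.

    `tweets` is a list of (user, text) pairs from one time interval.
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    scores = defaultdict(float)
    for user, text in tweets:
        for term in set(text.lower().split()):
            scores[term] += authority.get(user, 0.0)
    return scores


def emerging_terms(current, past_intervals, ratio=2.0):
    """Terms whose current score exceeds `ratio` times their mean past score.

    A simple stand-in for the paper's aging theory, which is not
    specified in the abstract.
    """
    emerging = []
    for term, score in current.items():
        past_mean = sum(p.get(term, 0.0) for p in past_intervals) / max(len(past_intervals), 1)
        if score > ratio * past_mean:
            emerging.append(term)
    return sorted(emerging, key=lambda t: current[t], reverse=True)


# Hypothetical toy data: three users and two time intervals.
followers = {"alice": ["carol"], "bob": ["carol"], "carol": ["alice"]}
authority = pagerank(followers)
past = [term_scores([("alice", "good morning"), ("bob", "good coffee")], authority)]
current = term_scores(
    [("carol", "earthquake now"), ("alice", "earthquake good morning")], authority
)
emerging = emerging_terms(current, past)
```

In this toy run, "good" and "morning" are not flagged because they already carried weight in the earlier interval, while terms absent from the past are. Weighting each occurrence by the author's authority, rather than counting raw frequency, is one simple way to reflect the abstract's point that the importance of content depends on its source.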


          • Published in

            MDMKDD '10: Proceedings of the Tenth International Workshop on Multimedia Data Mining
            July 2010
            86 pages
            ISBN: 9781450302203
            DOI: 10.1145/1814245

            Copyright © 2010 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Acceptance Rates

            Overall Acceptance Rate: 3 of 5 submissions, 60%
