DOI: 10.1145/1316902.1316919
research-article

Extracting the discussion structure in comments on news-articles

Published: 09 November 2007

ABSTRACT

Several on-line daily newspapers offer readers the opportunity to comment directly on articles. In the Netherlands this feature is used quite often, and the quality of the comments (both grammatically and content-wise) is surprisingly high. We develop techniques to collect, store, enrich and analyze these comments. After giving a high-level overview of the Dutch 'commentosphere', we zoom in on extracting the discussion structure found in flat comment threads: people not only comment on the news article, they also comment heavily on other comments, so that threads resemble discussion fora. We show how techniques from information retrieval, natural language processing and machine learning can be used to extract the 'reacts-on' relation between comments with high precision and recall.
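To make the task concrete, the following is a minimal sketch of what extracting a 'reacts-on' relation from a flat thread could look like. It is not the method of the paper, whose details the abstract does not give. The `Comment` type, the two cues (an author-name mention and a surface-similarity score from Python's standard `difflib`), and the fixed `threshold` are all assumptions chosen for illustration. The abstract mentions machine learning, so a learned model would presumably take the place of the hand-set threshold used here.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Comment:
    idx: int      # position in the flat thread (0 = first comment)
    author: str
    text: str


def reacts_on_score(reply: Comment, candidate: Comment) -> float:
    """Hypothetical score that `reply` reacts on `candidate` (assumed cues)."""
    score = 0.0
    # Cue 1 (assumption): the reply mentions the candidate's author by name.
    if candidate.author.lower() in reply.text.lower():
        score += 1.0
    # Cue 2 (assumption): surface similarity of the two texts, in [0.0, 1.0].
    score += SequenceMatcher(None, reply.text.lower(), candidate.text.lower()).ratio()
    return score


def extract_reacts_on(thread: List[Comment], threshold: float = 1.0) -> List[Optional[int]]:
    """For each comment, return the idx of the earlier comment it most likely
    reacts on, or None if it is taken to react on the article itself."""
    parents: List[Optional[int]] = []
    for i, reply in enumerate(thread):
        best, best_score = None, threshold
        # Only earlier comments can be reacted on.
        for candidate in thread[:i]:
            s = reacts_on_score(reply, candidate)
            # ">=" means ties go to the later (more recent) candidate.
            if s >= best_score:
                best, best_score = candidate.idx, s
        parents.append(best)
    return parents


if __name__ == "__main__":
    thread = [
        Comment(0, "Jan", "The new train schedule is a disaster for commuters."),
        Comment(1, "Piet", "Finally some good news about the budget."),
        Comment(2, "Els", "@Jan I disagree, the schedule works fine for me."),
    ]
    print(extract_reacts_on(thread))
```

With the default threshold of 1.0, a link is effectively created only when the reply names an earlier author (or repeats an earlier comment verbatim). Lowering the threshold lets text similarity alone create links, which would trade precision for recall.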


Published in

WIDM '07: Proceedings of the 9th annual ACM international workshop on Web information and data management
November 2007
168 pages
ISBN: 9781595938299
DOI: 10.1145/1316902

Copyright © 2007 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States
