DOI: 10.1145/2505515.2505613
research-article

Concept-based analysis of scientific literature

Published: 27 October 2013

ABSTRACT

This paper studies the importance of identifying and categorizing scientific concepts as a way to achieve a deeper understanding of a scientific community's research literature. To this end, we propose an unsupervised bootstrapping algorithm for identifying and categorizing mentions of concepts. We then propose a new clustering algorithm that uses citation contexts to cluster the extracted mentions into coherent concepts. Our evaluation of the algorithms against gold standards shows significant improvement over state-of-the-art results. More importantly, we analyze the computational linguistics literature using the proposed algorithms and show four different ways to summarize and understand the research community that are difficult to obtain using existing techniques.


    Published in

      CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
      October 2013
      2612 pages
      ISBN: 9781450322638
      DOI: 10.1145/2505515

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      CIKM '13 Paper Acceptance Rate: 143 of 848 submissions, 17%
      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
