Using argumentation to extract key sentences from biomedical abstracts

https://doi.org/10.1016/j.ijmedinf.2006.05.002

Abstract

PROBLEM: Key word assignment has long been used in MEDLINE to provide an indicative "gist" of the content of articles and to help retrieve biomedical articles. Abstracts are also used for this purpose. However, at usually more than 300 words, MEDLINE abstracts can still be regarded as long documents; we therefore design a system to select a single key sentence. This key sentence must be indicative of the article's content, and we assume that an abstract's conclusions are good candidates. We design and assess the performance of an automatic key sentence selector, which classifies sentences into four argumentative moves: PURPOSE, METHODS, RESULTS and CONCLUSION. METHODS: We rely on Bayesian classifiers trained on automatically acquired data. Feature representation, selection and weighting are reported, and classification effectiveness is evaluated on the four classes using confusion matrices. We also explore simple heuristics that take the position of sentences into account. Recall, precision and F-scores are computed for the CONCLUSION class. RESULTS AND CONCLUSION: For the CONCLUSION class, the F-score reaches 84%. Automatic argumentative classification using Bayesian learners is feasible on MEDLINE abstracts and should help user navigation in such repositories.

Introduction

Systems for text mining are becoming increasingly important in biomedicine because of the exponential growth of knowledge. The mass of scientific literature needs to be filtered and categorized to provide for the most efficient use of the data. The problem of accessing this increasing volume of data demands systems that can extract pertinent information from unstructured texts, hence the importance of key word extraction, as well as key sentence extraction. While the former task has been largely addressed in text categorization studies [1], the current status of the latter is the subject of this report. Defining what a key sentence is is a complex task because, even more than key words, key sentences depend on the domain and on the reader's point of view; however, just as key words can be provided comprehensively through a controlled vocabulary, we believe it is possible to propose linguistically motivated criteria to define key sentences in order, for example, to separate well-known and well-established background knowledge or methods from new or putative facts, usually reported in the conclusion sections of articles. Applying key word mapping methods to extract informative content is a well-known technique for navigating digital documents [2], [3] and for extracting conceptual information [30], but sentences provide additional material and therefore suggest original strategies. As stated in professional guidelines (ANSI/NISO Z39.14-1979), articles in experimental sciences tend to respect strict argumentative patterns with at least four sections: PURPOSE–METHODS–RESULTS–CONCLUSION. These four moves – leaving aside minor variations in labels – are reported to be very stable across different scientific genres (chemistry, anthropology, computer science, linguistics…) [4], and are confirmed in biomedical abstracts and articles [5], [6], [30].
Following recent developments in information retrieval [7], [28], which show that conclusions are the most content-bearing sentences for related-article search and index-pruning tasks in MEDLINE, we assume that conclusion sentences are good candidates for such key sentences in scientific texts.
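The core of the approach described in the abstract – a Bayesian classifier assigning each sentence one of the four argumentative moves – can be sketched as follows. This is a minimal multinomial naive Bayes with add-one smoothing, written for illustration; the paper's actual features, weighting scheme and training corpus are not reproduced here, and the tokenizer is deliberately simplistic.

```python
import math
from collections import Counter

CLASSES = ["PURPOSE", "METHODS", "RESULTS", "CONCLUSION"]

def tokenize(sentence):
    # Lowercase word tokens; surrounding punctuation stripped for simplicity.
    tokens = (w.strip(".,;:()").lower() for w in sentence.split())
    return [t for t in tokens if t]

class NaiveBayesSentenceClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Assumes every class appears at least once in the training data.
    """

    def __init__(self):
        self.class_counts = Counter()
        self.word_counts = {c: Counter() for c in CLASSES}
        self.vocab = set()

    def train(self, labelled_sentences):
        for sentence, label in labelled_sentences:
            self.class_counts[label] += 1
            for w in tokenize(sentence):
                self.word_counts[label][w] += 1
                self.vocab.add(w)

    def classify(self, sentence):
        total = sum(self.class_counts.values())
        best, best_lp = None, float("-inf")
        for c in CLASSES:
            # log prior + sum of smoothed log likelihoods
            lp = math.log(self.class_counts[c] / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for w in tokenize(sentence):
                lp += math.log((self.word_counts[c][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = c, lp
        return best
```

After `train()` is called with `(sentence, label)` pairs, `classify()` returns the most probable of the four moves for an unseen sentence.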

The remainder of the paper is organized as follows. Section 2 provides an overview of the state-of-the-art. Section 3 describes the data and methods used to develop our sentence categorizer. Section 4 evaluates our developments and Section 5 concludes on our experiments.


Background

Selecting argumentative content is formally a classification task: for any input text, the system must decide which sentences are conclusions and which are not. Abstracts are split into sentences using a set of manually crafted regular expressions [18]. Intuitively, sentences are natural candidate segments for argumentative classification [8], because they are semantically more self-contained than phrases. Although anaphoric phenomena may demand larger segments [9], it has been shown
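The sentence-splitting step mentioned above can be sketched as follows. This is a much-simplified stand-in for the manually crafted rules of [18], which are not reproduced here; the abbreviation list is purely illustrative.

```python
import re

# Protect a few common abbreviations so a period inside them is not
# treated as a sentence boundary (illustrative, far from exhaustive).
ABBREV = re.compile(r"\b(e\.g|i\.e|et al|vs|Fig|Dr)\.$")

def split_sentences(text):
    """Split on ., ! or ? followed by whitespace and a capital letter."""
    parts, start = [], 0
    for m in re.finditer(r"[.!?]\s+(?=[A-Z])", text):
        candidate = text[start:m.end()].strip()
        if ABBREV.search(candidate):
            continue  # boundary falls inside an abbreviation; keep scanning
        parts.append(candidate)
        start = m.end()
    tail = text[start:].strip()
    if tail:
        parts.append(tail)
    return parts
```

Real biomedical text needs many more rules (decimal numbers, gene names, initials), which is precisely why the authors resorted to a hand-crafted rule set.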

Methods and data

In this section, we first describe the data used to train and test our argumentative classifier, and then we report on the construction of the categorizer.
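One concrete piece of the categorizer's construction is feature selection; the conclusion of the paper reports that DF-thresholding performed best. A minimal sketch of document-frequency thresholding (the cutoff value `min_df=2` is an assumption for illustration, not the paper's setting):

```python
from collections import Counter

def df_threshold(tokenised_docs, min_df=2):
    """Keep only terms whose document frequency reaches min_df.

    DF-thresholding is a cheap feature-selection step: terms seen in
    very few documents are usually noise for a word-based classifier.
    """
    df = Counter()
    for doc in tokenised_docs:
        for term in set(doc):  # count each term once per document
            df[term] += 1
    return {t for t, n in df.items() if n >= min_df}
```

The surviving term set would then define the feature space on which the Bayesian learners are trained.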

Results

In this section, we report on the evaluation of our argumentative categorizer on sets B and C. The system is evaluated with and without positional information. Table 2 gives the confusion matrices between the expected class (columns) and the class predicted by the classifier (rows): the diagonal (top left to bottom right) indicates the rate of correctly classified segments for each class. An example of the output is given in Fig. 3. Confusion matrices help to identify
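The evaluation machinery described here – a confusion matrix over the four classes plus per-class recall, precision and F-score – can be sketched as follows (a generic implementation, not the paper's evaluation code):

```python
from collections import Counter

def confusion_matrix(expected, predicted):
    # Keyed (expected, predicted); diagonal entries are correct decisions.
    return Counter(zip(expected, predicted))

def prf(matrix, cls):
    """Precision, recall and F-score for one class, one-vs-rest."""
    tp = matrix[(cls, cls)]
    fp = sum(n for (e, p), n in matrix.items() if p == cls and e != cls)
    fn = sum(n for (e, p), n in matrix.items() if e == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f
```

Applied to the CONCLUSION class, `prf` yields the recall, precision and F-score figures reported for that class.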

Conclusion

We have reported on the construction of a categorizer that classifies sentences of biomedical abstracts into a four-class argumentative model. The system is based on a set of Bayesian learners trained on automatically acquired corpora and augmented with distributional heuristics. Feature weighting was optimal with DF-thresholding. For the CONCLUSION class, which has been reported to contain more highly informative content than other sentences, we obtain an F-score of 85%. These results

Acknowledgments

The study was supported by the EU-IST program (SemanticMining, Grant 507505; Swiss OFES Grant 03.0399) and by the Swiss National Foundation (EAGL, Grant No. 3252B0-105755). We would like to thank A. Gaudinat for integrating the argumentative classifier into the WRAPIN demonstrator (WRAPIN portal: http://www.wrapin.org/). We also thank Frédérique Lisacek, who helped design the evaluation data.

References (34)

  • F. Sebastiani, Machine learning in automated text categorization, ACM Comput. Surveys (2002)
  • A. Aronson et al., The NLM indexing initiative, Proc. AMIA Symp. (2000)
  • P. Ruch et al., Learning-free text categorization, AIME (2003)
  • C. Orasan, Patterns in scientific abstracts
  • J. Swales, Genre Analysis: English in Academic and Research Settings (1990)
  • F. Salanger-Meyer, Discoursal movements in medical English abstracts and their linguistic exponents: a genre analysis study, INTERFACE: J. Appl. Linguist. (1990)
  • I. Tbahriti, C. Chichester, F. Lisacek, P. Ruch, Using argumentation to retrieve articles with similar citations: an...
  • P. Ruch et al., Report on the TREC 2003 experiment: genomic track, TREC (2003)
  • U. Hahn et al., Why discourse structures in medical reports matter for the validity of automatically generated text knowledge bases, Medinfo (1998)
  • J. Kupiec, J. Pedersen, F. Chen, A trainable document summarizer, SIGIR 1995,...
  • Y. Yang, An evaluation of statistical approaches to text categorization, J. Inf. Retrieval (1999)
  • S. Dumais et al., Inductive learning algorithms and representations for text categorization, CIKM, ACM (1998)
  • K. Ming Adam Chai et al., Bayesian online classifiers for text classification and filtering, SIGIR (2002)
  • D. Beeferman et al., Statistical models for text segmentation
  • D. Lewis et al., A comparison of two learning algorithms for text categorization, SDAIR (1994)
  • P. Domingos et al., On the optimality of the simple Bayesian classifier under zero-one loss, Machine Learn. (1997)
  • Y. Yang, J. Pedersen, A comparative study on feature selection in text categorization, Proceedings of 14th...