A corpus-based EAP course for NNS doctoral students: Moving from available specialized corpora to self-compiled corpora

https://doi.org/10.1016/j.esp.2005.02.010

Abstract

This paper presents a discussion of an experimental, innovative course in corpus-informed EAP for doctoral students. Participants were given access to specialized corpora of academic writing and speaking, instructed in the tools of the trade (web- and PC-based concordancers) and gradually inducted into the skills needed to best exploit the data and the tools for directed learning as well as self-learning. After the induction period, participants began to compile two additional written corpora: one of their own writing (term papers, dissertation drafts, unedited journal drafts) and one of ‘expert’ writing, culled from electronic versions of published papers in their own field or subfield. Students were thus able to make comparisons between their own writing and that of more established writers in their field. At the end of the course, participants presented reports of their discoveries, with some discussion of how they felt their rhetorical consciousness had been raised, and reflected on what further use they might make of corpus linguistics techniques in their future careers. This paper gives an overview of how this course was structured, presents the kinds of discoursal and other linguistic phenomena examined and the sometimes surprising observations made, and reports on the pluses and minuses of this corpus-informed course as a whole, seen from the point of view of both learners and instructors.

Introduction

In many ways, the position of corpus linguistics, as a powerful methodology-technology, is well-established. It has been consolidating its position in lexicography (e.g., the CoBuild dictionaries), in grammars of languages (e.g., Biber, Johansson, Leech, Conrad, & Finegan, 1999, the CoBuild grammar series represented by Francis et al., 1996, Francis et al., 1998, Hunston and Francis, 2000), and in diachronic studies (e.g., Hickey et al., 1997, Kytö et al., 1994, Nevalainen and Kahlas-Tarkka, 1998, Rudanko, 2000). Further, the value of corpus-based approaches for language specialists such as language majors, language teachers and translators is becoming increasingly recognized (Bowker and Pearson, 2002, Cheng et al., 2003, Granger et al., 2002, Hunston, 2002, Kennedy, 1998, Partington, 1998, Sinclair, 2004). Mair (2002) has also interestingly argued that the availability of appropriate corpora has been of great benefit to foreign language lecturers, such as Anglicists in Germany, because they are no longer beholden to the whims of native speakers for judgments of acceptability and correctness.

Corpus linguistics has also established itself as an important tool for determining the linguistic features of registers and genres (Biber, 1988, Lee, 2000) and, in EAP, for elucidating and comparing how different disciplines use language in their major genres (Bernardini, 2002, Flowerdew, 1998, Gavioli, 2002, Ghadessy et al., 2001, Hyland, 2000, Hyland, 2003, Luzon Marco, 2000, Thompson, 1998). However, it is less clear how – and when – these research findings can best be carried over into effective pedagogical practice. There are a number of issues here. One resides in the comparative merits of static, pre-prepared on-paper teaching products versus more spontaneous on-line work in front of a computer. A textbook such as Thurstun and Candlin (1997) shows what can be done with the former approach, but in their companion article (Thurstun & Candlin, 1998), they do acknowledge the danger of concordancing burnout: “Over-exposure to concordance lines can conceivably tire students if teaching of this type depends solely on deduction from concordance lines” (p. 278). This in turn raises a second issue: Should concordance work be supplementary to more traditional EAP instruction, or can it be made central with traditional instructor explication and exemplification reduced to the periphery?

A third factor to be considered is the level of disciplinary acculturation of the target participants. If those participants are just beginning to attempt the professional genres of their chosen specialization, then they probably need considerable help at the macro or structural level (such as with audience analysis and/or with the organization of the paper). This in turn suggests that a wholesale commitment to a corpus-based approach may not be fully effective, since concordancing tends to work better at the lexico-grammatical and phraseological levels than at the structural level. On the other hand, if they already possess the appropriate genre knowledge, as is often the case with non-native-speaker (NNS) students completing their doctoral degrees, then what they may chiefly lack is the fine-tuning of lexical and syntactic subtleties, particularly in terms of their strategic and rhetorical implications.

A fourth and final issue that we raise at this time concerns the degree of specialization in the corpora to be made available or to be constructed. What might be the merits and uses of a large, if sub-dividable, corpus such as the British National Corpus (BNC), as opposed to a collection of scholarly writing, such as Hyland’s corpus of 240 research articles drawn from eight fields (Hyland, 2001), as further opposed to highly tailored corpora that reflect pretty exactly participants’ target discourses? And where in this mix might we best place an academic spoken corpus? Should this be used for sustained or only occasional cross-register comparisons?

It is these kinds of questions that we attempted to grapple with in an experimental course for doctoral students at the English Language Institute (ELI) of the University of Michigan in the US. At this institution, as often elsewhere, successful advanced EAP (English for Academic Purposes) courses tend to focus on rhetorical consciousness-raising, often via comparison and discussion of different versions of the same discourse, or of comparable discourses, often from different disciplines (Belcher, 2004, Casanave, 2003, Swales and Feak, 2000, Thompson and Tribble, 2001). With the various types of corpora and concordancers now available, it seems that such an approach might be utilized with even greater facility, accuracy, and comprehensiveness if the right kinds of learners had access to the right texts and the right kind of corporist tools in the right kind of environment. This paper therefore outlines and analyses an experimental course in corpus-informed EAP. The course was deliberately entitled “Exploring your own discourse world” – for reasons that will become apparent – and had a primary focus on advanced academic writing, with a secondary one on academic speaking. The great majority of the sessions took place in a computer lab, where WordSmith Tools (Scott, 1996) and various corpora were installed, and where the instructors’ monitors could be projected on a screen for all to see.
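For readers unfamiliar with concordancing, the basic operations involved might be sketched roughly as follows. This is only a minimal illustration in Python and makes no claim to reflect how WordSmith Tools or any other concordancer is implemented. The function names (`tokenize`, `kwic`, `per_thousand`) and the two miniature 'corpora' are hypothetical, invented purely for this example.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def kwic(tokens, node, width=3):
    """Return key-word-in-context lines: up to `width` tokens either side of each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so the node word lines up in a column.
            lines.append(f"{left:>30}  [{tok}]  {right}")
    return lines


def per_thousand(tokens, node):
    """Frequency of `node` normalized per 1,000 tokens (0.0 for an empty corpus)."""
    return 1000 * tokens.count(node) / len(tokens) if tokens else 0.0


# Hypothetical miniature 'corpora', invented for illustration only.
student = tokenize("The results show that the model is good. The data show that it works.")
expert = tokenize("These results suggest that the model may be adequate. The data indicate a trend.")

for line in kwic(student, "show"):
    print(line)

# Normalized frequencies allow corpora of different sizes to be compared.
print(per_thousand(student, "show"), per_thousand(expert, "show"))
```

Normalizing to a rate per 1,000 tokens is one common way of comparing a small self-compiled corpus with a larger 'expert' one; with real data, the tokenizer and the context width would need to suit the texts in question.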

In the next section, we present an overview of the course, illustrating and discussing the kinds of discoursal and other linguistic phenomena examined and the sometimes surprising observations made. Here we place some emphasis on what worked and what worked less well. We then move into a consideration of the results of the corpus studies conducted by our students based on corpora they had themselves compiled. We conclude with overall reflections on this corpus-informed course, seen from the point of view of both EAP learners and instructors, as well as viewed from the perspective of the four issues outlined earlier.

Outline of the course

This 13-week course was conceived of as having three main components:

  • Weekly 2-h laboratory sessions (hands-on concordancing and EAP lessons) for the first 10 weeks.

  • Optional weekly individual consultations (miscellaneous written work handed in for comments and corrections, plus assistance with corpus compilation), as well as (Weeks 11 and 12) help with the preparation of the final projects. The consultations also proved useful by giving us clues about specific areas of weakness that the use of

Selected details from the computer classroom

Because this was an experimental course, both instructors agreed that it would be best to have a ‘dynamic syllabus’ (also known – disparagingly – as ‘making up the syllabus as you go along’, or – more creatively – as ‘just in time materials’). There was inevitably a fair amount of trial-and-error, and some activities worked better than others. The whole enterprise was also marked by the kind of incidentalism that Swales (2002) has criticized as being characteristic of corpus

The student presentations

As mentioned earlier, students were told at the beginning that they would have to give a presentation on some corpus-based findings. Work towards their presentations was, however, generally concentrated toward the end of the course, when their self-compiled ESP corpora were ready for analysis. In the end, there were three presentations, one shared and two individual. Only one of the three presentations was based on a comparison of the two kinds of corpora they compiled for themselves (published

Discussion

In a recent survey of computer use in EFL classrooms in UK institutions of higher education, Jarvis (2004) found that concordancing was being used, on average, in about one in ten of the ESP/EAP courses. This use, according to his returned questionnaires, was considerably less than that reported for web-based language learning materials, e-mail writing tasks, word-processor tasks, and even the use of PowerPoint. Although Jarvis did not solicit information about how much of course time might be

References (63)

  • S. Bernardini. Systematising serendipity: Proposals for concordancing large corpora with language learners.
  • S. Bernardini. Exploring new directions for discovery learning.
  • D. Biber. Variation across speech and writing (1988).
  • D. Biber et al. Corpus linguistics: Investigating language structure and use (1998).
  • L. Bowker et al. Working with specialized language: A practical guide to using corpora (2002).
  • C.P. Casanave. Multiple uses of applied linguistics in a multi-disciplinary graduate EAP class. ELT Journal (2003).
  • D. Coniam. Concordancing oneself: Constructing individual textual profiles. International Journal of Corpus Linguistics (2004).
  • S. Conrad. Will corpus linguistics revolutionize grammar teaching in the 21st century? TESOL Quarterly (2000).
  • S. Fligelstone. Some reflections on the question of teaching, from a corpus linguistics perspective. ICAME Journal (1993).
  • L. Flowerdew. Concordancing on an expert and learner corpus for ESP. CALL Journal (1998).
  • G. Francis et al. Collins COBUILD Grammar Patterns 1: Verbs (1996).
  • G. Francis et al. Collins COBUILD Grammar Patterns 2: Nouns and adjectives (1998).
  • Fu, Y. & Hong, T. (2004). Is the definite article disappearing from medical research papers? In: Paper presented at a...
  • L. Gavioli. Some thoughts on the problem of representing ESP through small corpora.
  • P.Y. Gu. Fine brush and freehand: The vocabulary-learning art of two successful Chinese EFL learners. TESOL Quarterly (2003).
  • Hickey, R., Kytö, M., Lancashire, I. & Rissanen, M. (Eds.), (1997). Tracing the trail of time. In Proceedings from the...
  • S. Hunston. Corpora in applied linguistics (2002).
  • S. Hunston et al. Pattern grammar: A corpus-driven approach to the lexical grammar of English (2000).
    David Lee is an Assistant Professor at the Nagoya University of Commerce and Business. He was formerly a post-doctoral research fellow at the English Language Institute (ELI), University of Michigan, where he worked on the MICASE (Michigan Corpus of Academic Spoken English) project, conducting research on the corpus and enhancing and promoting its usefulness for language teachers in materials development and language testing. His doctoral study conducted at Lancaster University (UK), Modelling Variation in Spoken And Written English: the Multi-Dimensional Approach Revisited, is scheduled to be published by Routledge in 2005.

    John M. Swales is Professor of Linguistics at the University of Michigan, where he was also Director of the English Language Institute from 1985 to 2001. His latest book-length publications are the second edition of “Academic Writing for Graduate Students” (with Christine Feak) (University of Michigan Press, 2004), and “Research Genres: Explorations and Applications” (CUP, 2004). In 2004, he was awarded an honorary Ph.D. from the University of Uppsala, Sweden.