Using Data Mining, Text Mining, and Bibliometric Techniques to the Research Trends and Gaps in the Field of Language and Linguistics

Journal of Psycholinguistic Research

Abstract

This study used descriptive and exploratory methods to analyze 2162 published documents, of which 1903 are articles, that appeared in the journal System from 1973 to 2020 and are indexed in the Scopus database. Data preprocessing and analysis were performed with data mining, text mining, and bibliometric techniques using Excel, VOSviewer, and RapidMiner. Among the text mining techniques, N-grams were used to analyze the article titles and identify their themes. Among the data mining techniques, clustering was applied to explore clusters of languages, educational technologies, technological spaces for foreign languages, etc. Bibliometric techniques such as co-authorship network analysis and citation analysis were used to analyze the leading contributions and research trends in System. The results fall into five categories: (1) journal status; (2) publication trend; (3) articles with and without abstracts or keywords; (4) highly cited and uncited articles; and (5) core and poor topics and keywords. The core topics are English as a Foreign Language, motivation, and second language acquisition. Among the languages, English, Chinese, and Japanese are discussed most often, whereas Italian, Danish, Persian, and Taiwanese are discussed less. Based on the findings, System has stayed in line with its aims and scope, namely the application of educational technology and applied linguistics to the problems of foreign language teaching and learning.
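The N-gram step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' workflow, which the abstract says was carried out in RapidMiner; the function names `title_ngrams` and `count_ngrams` and the three sample titles are hypothetical and serve only to show how word bigrams in titles can surface recurring themes.

```python
import re
from collections import Counter


def title_ngrams(title, n=2):
    """Return the word n-grams of a single title (lowercased, punctuation dropped)."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(titles, n=2):
    """Count n-gram occurrences across a collection of titles."""
    counts = Counter()
    for title in titles:
        counts.update(title_ngrams(title, n))
    return counts


# Hypothetical titles for illustration only; they are not taken from the study's data.
SAMPLE_TITLES = [
    "Motivation in second language acquisition",
    "Second language acquisition and learner motivation",
    "English as a foreign language in secondary schools",
]

bigram_counts = count_ngrams(SAMPLE_TITLES, n=2)
top_bigrams = bigram_counts.most_common(2)
print(top_bigrams)
```

In this toy input, "second language" and "language acquisition" each occur twice and rank first. A real analysis would normally also remove stop words and merge spelling variants before counting.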

Data Availability

The author confirms that all data generated or analysed during this study are included in this published article.

Funding

No funding was received for conducting this study.

Author information

Corresponding author

Correspondence to Mehrdad CheshmehSohrabi.

Ethics declarations

Conflict of interests

The authors declare they have no financial interests.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

About this article

Cite this article

CheshmehSohrabi, M., Mashhadi, A. Using Data Mining, Text Mining, and Bibliometric Techniques to the Research Trends and Gaps in the Field of Language and Linguistics. J Psycholinguist Res 52, 607–630 (2023). https://doi.org/10.1007/s10936-022-09911-6

  • DOI: https://doi.org/10.1007/s10936-022-09911-6
