Published in association
with the JALT VOCAB SIG

Developing a discipline-specific corpus and high-frequency word list for science and engineering students in graduate school
Suwako Uehara, Hibiya Haraki, Stuart McLean
Author(s) | Suwako Uehara, Hibiya Haraki, Stuart McLean |
---|---|
Paper type | Regular Article |
Pages | 57-68 |
DOI | https://doi.org/10.7820/vli.v11.2.uehara |
Year | 2022 |
Abstract
Japanese graduate school students in the field of science and engineering need to read academic research in their second language (L2), and such tasks can be challenging. Previous research has shown a strong correlation (0.78) between vocabulary size and reading comprehension (McLean et al., 2020), so providing high-frequency word lists could enhance comprehension. In this work-in-progress, 1.35 million tokens of professor-recommended reading materials were used to investigate three questions: a method to create a vocabulary list that would benefit science majors in graduate school, the procedures to create a corpus and a high-frequency word list efficiently, and the steps required to create a cleaner corpus. This paper outlines the corresponding outcomes. First, it describes a systematic, literature-informed method that includes input from professors in the field. Second, the combined use of a tailored MATLAB script and AntConc (Anthony, 2022) generated the corpus and the high-frequency word list efficiently. Third, repeatedly comparing the original PDFs with the matching text files, then adding to the MATLAB script to deal with the specific issues found, produced a cleaner text. The proposed method can be applied in other contexts to enhance the generation of high-frequency word lists.
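As a rough illustration of the frequency-counting step described in the abstract, the sketch below counts word frequencies across a set of plain-text documents and keeps the most frequent items. It is not the authors' MATLAB and AntConc pipeline. The function names `tokenize` and `high_frequency_words`, the simple alphabetic tokenizer, the `min_range` document-coverage filter, and the two sample sentences are all assumptions made for illustration only.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens.

    Assumption: a simple regex tokenizer; the source does not describe
    how its tools split text into tokens.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def high_frequency_words(documents, top_n=10, min_range=1):
    """Return up to top_n (word, count) pairs, most frequent first.

    min_range is a hypothetical filter: a word is kept only if it
    occurs in at least that many documents.
    """
    freq = Counter()       # total occurrences across all documents
    doc_range = Counter()  # number of documents containing each word
    for text in documents:
        tokens = tokenize(text)
        freq.update(tokens)
        doc_range.update(set(tokens))
    ranked = [(w, c) for w, c in freq.most_common() if doc_range[w] >= min_range]
    return ranked[:top_n]


# Made-up sample sentences standing in for cleaned text files.
sample_docs = [
    "The sensor measures the voltage across the circuit.",
    "Voltage drops when the circuit resistance increases.",
]

for word, count in high_frequency_words(sample_docs, top_n=3, min_range=2):
    print(word, count)
```

In practice the input would be the text extracted from the source PDFs, so the quality of that extraction directly affects the counts, which is why the cleaning step matters.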
Suggested citation
Hibiya Haraki, Stuart McLean, Suwako Uehara. (2022). Developing a discipline-specific corpus and high-frequency word list for science and engineering students in graduate school. Vocabulary Learning and Instruction, 11(2), 57–68. https://doi.org/10.7820/vli.v11.2.uehara