This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
A Methodology for Identification of the Formulaic Language Most Representative of High-frequency Collocations
James Rogers, Chris Brizzard, Frank Daulton, Cosmin Florescu, Ian MacLean, Kayo Mimura, John O’Donoghue, Masaya Okamoto, Gordon Reid, Yoshiaki Shimada
Researchers have stated that learning formulaic language is key to achieving fluency. It has also been stated that studying vocabulary in this way is more efficient than learning isolated vocabulary. However, there is a lack of research on which formulaic language should be taught, and a further lack of research on how such formulaic language can be identified. This study aimed to evaluate a methodology for identifying the most common formulaic language. It compared multi-word unit identification results from 500 and 1,000 example sentences and quantified how often native speakers opt to extend multi-word units beyond their core pivot and collocate. It also identified and quantified colligational issues affecting multi-word unit identification. The results showed no difference in multi-word unit identification between 500 and 1,000 example sentences, that native speakers opted to extend multi-word units more than half of the time, and that colligational issues affected only approximately 3% of the items examined. This study concluded that 500 example sentences are just as reliable as 1,000 when identifying multi-word units. It also found that extending multi-word units beyond their core pivot and collocate is an essential step researchers should take. Finally, it found that a colligational treatment is necessary if the aim is to achieve the most accurate data; however, the percentage of items affected was small and the methodology was time-consuming. This finding indicates a need for improved software to better automate the steps taken.
Suggested citation: Rogers, J., Brizzard, C., Daulton, F., Florescu, C., MacLean, I., Mimura, K., O'Donoghue, J., Okamoto, M., Reid, G., & Shimada, Y. (2014). A Methodology for Identification of the Formulaic Language Most Representative of High-frequency Collocations. Vocabulary Learning and Instruction, 3(1), 51–65. http://dx.doi.org/10.7820/vli.v03.1.rogers.et.al