Methodological differences matter: Identification thresholds and corpus composition in lexical bundle research

Fan Pan

Published: Feb 1, 2021

Issue: Vol. 38 No. 4 (2020)

Section: Articles

Copyright for articles published in this journal is retained by the publisher.

Abstract

In lexical bundle research, it is common practice to extract and compare lexical bundles across corpora using set identification thresholds. Studies in this line adopt varying frequency and dispersion thresholds because the corpora being compared always differ in size and/or number of texts. However, few studies have considered the consequences of these methodological differences. To bridge this gap, a series of experiments was conducted to explore the impact of identification thresholds and corpus composition on bundle extraction and on the results of cross-corpus comparison. The first set of experiments demonstrated that different identification thresholds applied to the same pair of corpora may yield conflicting results, indicating that methodological differences could be one source of the mixed results in the literature. Further, after removing the influence of differences in size and/or number of texts, the second set of experiments revealed that increasing the dispersion threshold proportionally to offset differences in the number of texts actually favours the corpus with the smaller number of texts. The study highlights the interaction between frequency thresholds and dispersion thresholds and the key role of dispersion thresholds in filtering bundles. The article also discusses the methodological implications for future contrastive lexical bundle research.
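To make the two kinds of identification threshold concrete, the following is a minimal sketch of bundle extraction, not the procedure used in the article. The function name `extract_bundles`, the whitespace tokenisation, the toy corpus and all threshold values are assumptions chosen purely for illustration. The frequency threshold is expressed per million words and converted to a raw count, while the dispersion threshold is the minimum number of texts in which a bundle must occur.

```python
from collections import Counter, defaultdict


def extract_bundles(texts, n=4, min_per_million=20.0, min_texts=3):
    """Return n-grams that meet both a frequency and a dispersion threshold.

    Hypothetical helper: tokenisation is a plain lowercase whitespace split,
    and the default thresholds are illustrative values only.
    """
    freq = Counter()                 # raw frequency of each n-gram
    text_range = defaultdict(set)    # ids of the texts containing each n-gram
    total_words = 0

    for text_id, text in enumerate(texts):
        tokens = text.lower().split()
        total_words += len(tokens)
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            freq[gram] += 1
            text_range[gram].add(text_id)

    # Convert the normalised frequency threshold into a raw count
    # for a corpus of this size.
    min_raw = min_per_million * total_words / 1_000_000

    return {
        " ".join(gram): (count, len(text_range[gram]))
        for gram, count in freq.items()
        if count >= min_raw and len(text_range[gram]) >= min_texts
    }


# Toy corpus of three very short "texts" (illustrative only).
corpus = [
    "on the other hand we see",
    "on the other hand it is",
    "on the other hand we see",
]

# Same frequency threshold, two different dispersion thresholds.
strict = extract_bundles(corpus, n=4, min_per_million=1.0, min_texts=3)
loose = extract_bundles(corpus, n=4, min_per_million=1.0, min_texts=2)
```

With the frequency threshold held constant, lowering the dispersion threshold from three texts to two admits additional bundles in this toy corpus, which illustrates in miniature how the dispersion setting can act as a filter on the resulting bundle list.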

Southern African Linguistics and Applied Language Studies