The more proficient the learners, the less sophisticated their L2 vocabulary?
The curious effect of the reference corpus on mean-frequency measures of lexical sophistication
Mean-frequency scores of lexical sophistication are used to evaluate written and spoken language production. They
are calculated using word frequencies extracted from a reference corpus. Using mixed-effects regression models, we analyse the
strength of the relationship between L2 proficiency and mean-frequency scores in spoken and written texts using reference corpora
representing different modes and registers. We control for task and topic effects. We observe that mean-frequency measures of
lexical sophistication are considerably more influenced by the mode and register of the reference corpus used to calculate these
scores than by language users’ proficiency level. Advanced language users produce more frequent vocabulary, typical of the target
register, in both spoken monologues and written essays. These results provide evidence in favour of a conceptual and
terminological shift from lexical sophistication to register appropriateness (as suggested by Durrant & Brenchley, 2019)
to refer to the construct captured by mean-frequency scores of vocabulary use.
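To make the measure concrete, the following is a minimal Python sketch of one way a mean-frequency score could be computed from a reference-corpus frequency list. It is an illustration under stated assumptions, not the procedure used in the study: the function name `mean_log_frequency`, the log10 per-million transformation, the decision to skip words absent from the reference list, and the toy counts are all hypothetical.

```python
import math


def mean_log_frequency(tokens, reference_counts, reference_size):
    """Mean log10 per-million frequency of the tokens found in a reference list.

    Assumptions (not taken from the article): frequencies are normalised
    per million words and log10-transformed, and tokens missing from the
    reference list are skipped rather than given a floor value.
    """
    scores = []
    for token in tokens:
        count = reference_counts.get(token.lower(), 0)
        if count == 0:
            continue  # assumption: ignore words absent from the reference list
        per_million = count / reference_size * 1_000_000
        scores.append(math.log10(per_million))
    # No scorable tokens: return NaN instead of raising
    return sum(scores) / len(scores) if scores else float("nan")


# Toy, invented counts standing in for two reference corpora of 1M words each
spoken_reference = {"think": 2000, "job": 500, "employment": 10}
written_reference = {"think": 300, "job": 400, "employment": 600}

text = ["think", "job"]
spoken_score = mean_log_frequency(text, spoken_reference, 1_000_000)
written_score = mean_log_frequency(text, written_reference, 1_000_000)
```

With these invented counts, the same two words receive a higher mean-frequency score against the spoken-style list than against the written-style list, which is the kind of dependence on the reference corpus that the abstract describes.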
Article outline
- 1. Introduction
- 2. The problem
- 3. Hypotheses
- 4. Methodology
- 5. Results
- 5.1 Results from the ICNALE spoken monologues
- 5.2 Results from the ICNALE Written Essays
- 6. Discussion
- 6.1 Hypothesis 1: The relationship between L2 proficiency and mean-frequency scores of content words (CW)
- 6.2 Hypothesis 2: The effect of mode and register of the reference corpus on mean-frequency scores of content words (CW)
- 6.2.1 The ICNALE spoken monologues
- 6.2.2 The ICNALE Written Essays
- 6.3 Limitations and future research
- 7. Conclusions
- Open data badge and open code badge
- Notes
- References