Publications

Publication details [#31263]

Abstract

In the era of globalization, one may reasonably assume that the influence of English and international words is not restricted to the lexicon, but that it also pervades other layers of language such as phraseology. Testing this hypothesis with corpora and computational techniques, however, poses a number of methodological problems. First, the representativeness of the corpora is hard to establish. Most phraseological units have a very low frequency per million words, which means that extensive corpora must be assembled for the different purposes of investigation. It is also unclear whether this problem calls for a more corpus-based or a more corpus-driven approach, with possible intermediate solutions. In the corpus-based approach, the researcher decides which constructions will be checked against corpora, whereas in the corpus-driven approach the data themselves serve as a guide. This question is also intrinsically linked to the more general discussion on the boundaries between lexicon, grammar and phraseology, because it is hard to determine any proportion of an unknown total number of phraseological units. The choice of approach will depend on the theoretical underpinnings of the research. As a possible compromise, an experiment in corpus-driven computational phraseology centered on the English adjective ‘digital’ is proposed. The use of a non-parametric statistical score and of comparable web-based corpora makes it possible to confirm the existence of phrases in other languages (in this case French and Spanish) that follow the general profile of the corresponding English word.
Source: Abstract in book
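
The abstract's point about low per-million frequencies and corpus size can be illustrated with simple arithmetic. The sketch below is not taken from the publication: the function names and the example figures (0.5 occurrences per million words, a target of 100 attestations) are hypothetical and chosen only for illustration.

```python
import math


def expected_hits(freq_per_million: float, corpus_tokens: int) -> float:
    """Expected number of occurrences of a unit in a corpus of the given size."""
    return freq_per_million * corpus_tokens / 1_000_000


def tokens_needed(freq_per_million: float, min_hits: int) -> int:
    """Smallest corpus size (in tokens) expected to yield at least min_hits."""
    return math.ceil(min_hits * 1_000_000 / freq_per_million)


# Hypothetical figures: a phrase occurring 0.5 times per million words.
print(expected_hits(0.5, 10_000_000))  # expected hits in a 10-million-word corpus
print(tokens_needed(0.5, 100))         # tokens needed for about 100 attestations
```

Under these assumed figures, a 10-million-word corpus would be expected to contain only a handful of attestations, which is consistent with the abstract's remark that extensive corpora must be assembled.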