Interference and normalization in genre-controlled multilingual corpora
Capturing the distinction between translated vs. original (i.e. non-translated) language varieties holds centre stage in corpus-based translation studies and related fields such as supervised machine learning. A similar question arises for the production of native vs. proficient non-native speakers. In both cases, the linguistic features that seem to be good indicators of the different language varieties appear to be genre-dependent. In the articles included in this volume, genre-controlled multilingual corpora are used to identify and measure two competing properties of both translational and non-native language varieties: (i) source (or native) language interference and (ii) normalization, which can be described as a tendency to conform to target-language standards. The topics addressed include linguistic features for uncovering and quantifying interference and normalization within a specific genre and across genres; the complementarity of comparable vs. parallel corpus data and of experimental vs. corpus data for investigating interference and normalization; and the extraction of highly similar and homogeneous comparable and parallel corpora from multilingual resources such as Europarl. A wide range of genres is examined, e.g. research article abstracts, parliamentary debates, administrative texts, and fiction.
[Belgian Journal of Linguistics, 27] 2013. v, 134 pp.
© John Benjamins Publishing Company
Table of Contents