Article published in: Languages in Contrast: Online-First Articles
Corpus-based contrastive studies and AI-generated translations
This article addresses the issue of using AI-generated translations to perform contrastive analysis. The aim is to
establish whether bidirectional translation corpora (and by extension human translators) have become superfluous, given the
success of AI-generated translations. To investigate this, results from a previous study of English bring and Norwegian
bringe based on bidirectional data are compared with results from a study based on AI-generated data. The
AI-generated translations show a markedly higher Mutual Correspondence between the verbs than the human translations. This
AI-translation effect may give an inaccurate picture of how equivalent the verbs really are. In cases where AI and human
translations deviate, the latter are characterised by semantically more specific verbs. The study highlights some features only
available in bidirectional corpora, including the possibility of taking individual variation into account. The findings suggest
that bidirectional corpora (and human translators) still have a role to play in contrastive studies.
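Mutual Correspondence, the measure on which the comparison above rests, is usually computed following Altenberg (1999) as the percentage of source-text occurrences of two items that are translated by each other. The sketch below assumes that definition; the function name and the counts are hypothetical and are not taken from the study.

```python
def mutual_correspondence(a_translated_by_b, b_translated_by_a,
                          a_in_source, b_in_source):
    """Return Mutual Correspondence (MC) as a percentage.

    Assumes Altenberg's (1999) definition:
        MC = (A_t + B_t) * 100 / (A_s + B_s)
    where A_t and B_t count how often each item is translated by the
    other, and A_s and B_s count each item's source-text occurrences.
    """
    total_source = a_in_source + b_in_source
    if total_source == 0:
        raise ValueError("at least one source-text occurrence is required")
    return 100.0 * (a_translated_by_b + b_translated_by_a) / total_source


# Hypothetical counts, for illustration only (not the study's data):
# item A occurs 100 times in source texts and is rendered by B 30 times;
# item B occurs 100 times in source texts and is rendered by A 50 times.
mc = mutual_correspondence(30, 50, 100, 100)
print(f"MC = {mc:.1f}%")
```

An MC of 100% would mean the two items always translate each other, and 0% that they never do. A markedly higher value in one data set than in another is therefore a sign that the translations in the first set pair the two items more uniformly.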
Article outline
- 1. Introduction
- 2. Bring vs. bringe in the English-Norwegian Parallel Corpus
- 3. AI-generated translation
- 4. Procedure of generating translations with GPT-4
- 5. Contrastive analysis of bring and bringe based on AI-generated vs. human translations
- 5.1 Quantitative overview
- 5.2 Quantitative and qualitative analysis of patterns of use
- 5.2.1 The ditransitive pattern
- 5.2.2 The monotransitive pattern
- 5.2.3 The complex transitive pattern
- 5.2.4 Fixed phrases
- 6. Limitations of the study
- 7. Discussion
- 8. Concluding remarks
- Acknowledgements
- Notes
- References
- Corpus
Published online: 20 September 2024
https://doi.org/10.1075/lic.00051.ebe
References (32)
Ahrenberg, L. 2017. Comparing Machine Translation and Human Translation: A Case Study. Proceedings of the Workshop Human-Informed Translation and Interpreting Technology (HiT-IT). Varna, Bulgaria, 7 September 2017. Association for Computational Linguistics. 21–28.
Altenberg, B. 1999. Adverbial connectors in English and Swedish: Semantic and lexical correspondences. In Out of Corpora: Studies in Honour of Stig Johansson, H. Hasselgård and S. Oksefjell (eds), 249–268. Amsterdam/Atlanta: Rodopi.
Baker, M. 1993. Corpus linguistics and translation studies. Implications and applications. In Text and Technology: In Honour of John Sinclair, M. Baker, G. Francis and E. Tognini-Bonelli (eds), 233–250. Amsterdam/Philadelphia: John Benjamins Publishing Company.
Baker, M. 2004. A corpus-based view of similarity and difference in translation. International Journal of Corpus Linguistics 9(2): 167–194.
Bokmålsordboka. 2024. The Language Council of Norway and the University of Bergen. Available at [URL] [last accessed 15 January 2024].
Brahmbhatt, A. 2023. GPT-3.5 vs GPT-4: an in-depth analysis of OpenAI’s language models. Available at [URL] [last accessed 5 April 2024].
Brown, P. F., Della Pietra, S. A., Della Pietra, V. J. and Mercer, R. L. 1990. The mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics 19(2), 263–311.
Cambridge Dictionary. 2024. English Dictionary Online. Cambridge University Press. Available at [URL] [last accessed 15 January 2024].
Ebeling, J. 1998. The Translation Corpus Explorer: A browser for parallel texts. In Corpora and Cross-linguistic Research: Theory, Method and Case Studies, S. Johansson and S. Oksefjell (eds), 101–112. Amsterdam: Rodopi.
Ebeling, J. and Ebeling, S. O. 2015. An English-Norwegian contrastive analysis of downtoners, more or less. Nordic Journal of English Studies (NJES) 14(1): 62–89.
Ebeling, S. O. 2017. Bringing home the bacon! A contrastive study of the cognates bring/bringe in English and Norwegian. Kalbotyra 70: 104–126.
Gellerstam, M. 1986. Translationese in Swedish novels translated from English. In Translation Studies in Scandinavia, L. Wollin and H. Lindquist (eds), 88–95. Lund: CWK Gleerup.
Hasselgård, H. 2010. Contrastive analysis / contrastive linguistics. In The Routledge Linguistics Encyclopedia, Third Edition, K. Malmkjær (ed.), 98–101. London: Routledge.
Hendy, A., Abdelrehim, M., Sharaf, A., Raunak, V., Gabr, M., Matsushita, H., Kim, Y. J., Afify, M. and Awadalla, H. 2023. How good are GPT models at machine translation? A comprehensive evaluation. Available at [URL] [last accessed 8 April 2024].
Hutchins, W. J. 2001. Machine translation over fifty years. Histoire Épistémologie Langage 23(1), 7–31.
Johansson, S. 2007. Seeing Through Multilingual Corpora: On the Use of Corpora in Contrastive Studies. Amsterdam/Philadelphia: John Benjamins.
Kanade, V. 2023. What is ChatGPT? Characteristics, uses, and alternatives. Available at [URL] [last accessed 4 April 2024].
Klaudy, K. and Károly, K. 2005. Implicitation in translation: Empirical evidence for operational asymmetry in translation. Across Languages and Cultures 6(1), 13–28.
Mandal, S. A. 2019. Evolution of Machine Translation. Towards Data Science. Available at [URL] [last accessed 13 July 2024].
Marr, B. 2023. The Top 10 Limitations of ChatGPT. Available at [URL] [last accessed 5 April 2024].
Muftah, M. 2022. Machine vs. human translation: A new reality or a threat to professional Arabic-English translators. PSU Research Review, Emerald Publishing Limited. Available at [URL] [last accessed 5 April 2024].
NAOB. 2024. Det Norske Akademis Ordbok. Det Norske Akademi for Språk og Litteratur. Available at [URL] [last accessed 15 January 2024].
OED. 2017 / 2024. Oxford English Dictionary. Oxford University Press. Available at [URL] [accessed 25 July 2017 / 15 January 2024].
Quirk, R., Greenbaum, S., Leech, G. and Svartvik, J. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Sinclair, J. 1999. A way with common words. In Out of Corpora: Studies in Honour of Stig Johansson, H. Hasselgård and S. Oksefjell (eds), 157–179. Amsterdam: Rodopi.
Summa Linguæ. 2021. Rule-Based vs. Statistical vs. Neural Machine Translation. Available at [URL] [last accessed 4 April 2024].
Vanmassenhove, E., Shterionov, D. and Gwilliam, M. 2021. Machine translationese: Effects of algorithmic bias on linguistic complexity in machine translation. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2203–2213. Available at [URL] [last accessed 4 April 2024].
Wikipedia: The Free Encyclopedia. 2024. GPT-4. Available at [URL] [last accessed 13 July 2024].
English-Norwegian Parallel Corpus (1994–1997), Dept. of British and American Studies, University of Oslo. Compiled by Stig Johansson (project leader), Knut Hofland (project leader), Jarle Ebeling (research assistant), Signe Oksefjell (research assistant). Available at [URL] [last accessed 15 January 2024].