Multiword Units in Machine Translation and Translation Technology
Editors
The correct interpretation of Multiword Units (MWUs) is crucial to many applications in Natural Language Processing (NLP), but it is a challenging and complex task. In recent years, the computational treatment of MWUs has received considerable attention, but there is much more to be done before we can claim that NLP and Machine Translation (MT) systems process MWUs successfully.
This volume provides a general overview of the field with particular reference to Machine Translation and Translation Technology and focuses on languages such as English, Basque, French, Romanian, German, Dutch and Croatian, among others. The chapters of the volume illustrate a variety of topics that address this challenge, such as the use of rule-based approaches, compound splitting techniques, MWU identification methodologies in multilingual applications, and MWU alignment issues.
[Current Issues in Linguistic Theory, 341] 2018. ix, 259 pp.
Publishing status: Available
Published online on 12 July 2018
© John Benjamins
Table of Contents
- About the editors | pp. vii–x
- Multiword units in machine translation and translation technology
  Johanna Monti, Violeta Seretan, Gloria Corpas Pastor and Ruslan Mitkov | pp. 1–38
Part 1. Multiword units in machine translation
- Analysing linguistic information about word combinations for a Spanish-Basque rule-based machine translation system
  Uxoa Iñurrieta, Itziar Aduriz, Arantza Díaz Ilarraza, Gorka Labaka and Kepa Sarasola | pp. 39–60
- How do students cope with machine translation output of multiword units? An exploratory study
  Joke Daems, Michael Carl, Sonia Vandepitte, Robert Hartsuiker and Lieve Macken | pp. 61–80
- Aligning verb + noun collocations to improve a French-Romanian FSMT system
  Amalia Todirascu and Mirabela Navlea | pp. 81–100
Part 2. Multiword units in multilingual NLP applications
- Multiword expressions in multilingual information extraction
  Gregor Thurmair | pp. 101–124
- A multilingual gold standard for translation spotting of German compounds and their corresponding multiword units in English, French, Italian and Spanish
  Simon Clematide, Stéphanie Lehner, Johannes Graën and Martin Volk | pp. 125–146
- Dutch compound splitting for bilingual terminology extraction
  Lieve Macken and Arda Tezcan | pp. 147–162
Part 3. Identification and translation of multiword units
- A flexible framework for collocation retrieval and translation from parallel and comparable corpora
  Oscar Mendoza Rivera, Ruslan Mitkov and Gloria Corpas Pastor | pp. 163–180
- On identification of bilingual lexical bundles for translation purposes: The case of an English-Polish comparable corpus of patient information leaflets
  Łukasz Grabowski | pp. 181–200
- The quest for Croatian idioms as multiword units
  Kristina Kocijan and Sara Librenjak | pp. 201–222
- Corpus analysis of Croatian constructions with the verb doći ‘to come’
  Goranka Blagus Bartolec and Ivana Matas Ivanković | pp. 223–242
- Anaphora resolution, collocations and translation
  Eric Wehrli and Luka Nerima | pp. 243–256
- Index
“[T]he book represents many interesting topics in the area of computational treatment of multiword expressions, with a special focus on MT and translation technology. [...] This book can essentially be viewed as an important contribution to a specialised area (i.e. computational treatment of MWUs) of interest, which will be a great help to NLP researchers, and MT researchers and users in particular.”
Rejwanul Haque, Andy Way, Dublin City University and Mohammed Hasanuzzaman, Cork Institute of Technology, Ireland, in Machine Translation Vol. 33 (2019)
“[T]he accuracy of MWU translations still remains a problem, and MWU processing and translation still pose the hardest challenges to MT and translation technology (TT). [...] [T]he book definitely makes an important contribution to MWU processing, thanks to the new angle it brings to the study of MWU in NLP and the diverse and innovative models for the computational treatment of MWU.”
Wang Hui and Zhang Xiaojun, Xi'an Jiaotong-Liverpool University, in Babel 65:5 (2019)
“The book covers the treatments of different types of MWUs (e.g. idioms, lexical bundles, collocations, compounds) in translation (i.e. by MT system or CAT software) and a number of use-cases. The first chapter by the editors of the book presents an extensive background study and survey on the computational treatment of MWUs in NLP applications (particularly with respect to different MT approaches).”
Rejwanul Haque, Dublin City University, Mohammed Hasanuzzaman, Cork Institute of Technology, and Andy Way, Dublin City University, in Machine Translation 33 (2021).
Subjects
Linguistics
Translation & Interpreting Studies
Main BIC Subject
CFK: Grammar, syntax
Main BISAC Subject
LAN009060: LANGUAGE ARTS & DISCIPLINES / Linguistics / Syntax