General-purpose statistical translation engine and domain specific texts
Would it work?
Michael Carl | Institut für Angewandte Informationsforschung
The past decade has witnessed exciting work in the field of Statistical Machine Translation (SMT). However, accurate evaluation of its potential in real-life contexts is still an open question. In this study, we investigate the behavior of an SMT engine faced with a corpus far different from the one it has been trained on. We show that terminological databases are obvious resources that should be used to boost the performance of a statistical engine. We propose and evaluate one way of integrating terminology into an SMT engine, which yields a significant reduction in word error rate.