Chapter 10
Impact of word alignment on word translation entropy and other metrics
A comparison of translation process research findings derived from different word alignment methods
Many of the findings from studies using the Center for
Research and Innovation in Translation and Translation Technology (CRITT)
Translation Process Research Database (TPR-DB) framework rely on
word(s)-to-word(s) alignments of the source text and target text. However,
little research has been done on the impact that different alignment methods
have on these findings. This study compares two different manual word
alignment methods and four automatic word alignment methods on the basis of
one English-Spanish TPR-DB study that has been used extensively (the BML12
dataset). We replicate past findings from the BML12 dataset using these
different alignments in order to determine the impact of alignment, and we
present qualitative and quantitative analyses of the different word-alignment
methods.
Article outline
- 1. Introduction
- 2. Summary of past research
- 3. Procedure
- 3.1 Manual alignment
- 3.2 Automatic alignment
- 3.3 Post-processing
- 4. Research questions 1 and 2: Qualitative analysis and replicating past studies
- 4.1 Carl and Schaeffer (2017)
- 4.2 Toledo-Báez and Carl (2020)
- 4.3 Ogawa et al. (2021)
- 5. Research question 3: Comparing measures across the alignment methods
- 6. Conclusions
- 6.1 RQ1: Is new manual alignment more consistent?
- 6.2 RQ2: Will alignment change results?
- 6.3 RQ3: Alignment consistency
- 6.4 Final remarks
- Notes
- References
- Appendix
References
Almazroei, Samar A., Haruka Ogawa, and Devin Gilbert. 2019. “Investigating Correlations Between Human Translation and MT Output.” In Proceedings of the Second MEMENTO Workshop on Modelling Parameters of Cognitive Effort in Translation Production, 11–13. Dublin, Ireland: European Association for Machine Translation. [URL]
Bates, Douglas, Martin Mächler, Ben Bolker, and Steve Walker. 2015. “Fitting Linear Mixed-Effects Models Using lme4.” Journal of Statistical Software 67 (1): 1–48.
Carl, Michael. 2012. “Translog-II: A Program for Recording User Activity Data for Empirical Reading and Writing Research.” In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), 4108–4112. Istanbul, Turkey: European Language Resources Association (ELRA).
Carl, Michael. 2021. “Information and Entropy Measures of Rendered Literal Translation.” In Explorations in Empirical Translation Process Research, ed. by Michael Carl, 113–40. Machine Translation: Technologies and Applications. Springer.
Carl, Michael, and Moritz Schaeffer. 2014. “Word Transition Entropy as an Indicator for Expected Machine Translation Quality.” In Proceedings of the Workshop on Automatic and Manual Metrics for Operational Translation Evaluation, ed. by Keith J. Miller, Lucia Specia, Kim Harris, and Stacey Bailey, 45–50. Reykjavik, Iceland. [URL]
Carl, Michael, Moritz Schaeffer, and Srinivas Bangalore. 2016. “The CRITT Translation Process Research Database.” In New Directions in Empirical Translation Process Research, ed. by Michael Carl, Moritz Schaeffer, and Srinivas Bangalore, 13–54. Springer.
Carl, Michael, and Moritz Jonas Schaeffer. 2017. “Why Translation Is Difficult: A Corpus-Based Study of Non-Literality in Post-Editing and From-Scratch Translation.” HERMES – Journal of Language and Communication in Business, no. 56 (October): 43–57.
Germann, Ulrich. 2008. “Yawat: Yet Another Word Alignment Tool.” In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics on Human Language Technologies: Demo Session, 20–23. Association for Computational Linguistics.
Gilbert, Devin, and Michael Carl. 2021. “Introducing a Word Alignment Dissimilarity Indicator: Alignment Links as Conceptualizations of a Focused Bilingual Lexicon.” In Proceedings of the First Workshop on Modelling Translation: Translatology in the Digital Age, ed. by Yuri Bizzoni, Elke Teich, Cristina España i Bonet, and Josef van Genabith, 74–81. Online, Berlin: Association for Computational Linguistics. [URL]
Lüdecke, Daniel. 2021. sjPlot: Data Visualization for Statistics in Social Science (R package version 2.8.7). [URL]
Mesa-Lao, Bartolomé. 2014. “Gaze Behaviour on Source Texts: An Exploratory Study Comparing Translation and Post-Editing.” In Post-Editing of Machine Translation: Processes and Applications, 219–45. Copenhagen Business School.
Och, Franz Josef, and Hermann Ney. 2003. “A Systematic Comparison of Various Statistical Alignment Models.” Computational Linguistics 29 (1): 19–51.
Ogawa, Haruka, Devin Gilbert, and Samar A. Almazroei. 2021. “redBird: Rendering Entropy Data and ST-Based Information into a Rich Discourse on Translation: Investigating Relationships between MT Output and Human Translation.” In Explorations in Empirical Translation Process Research, ed. by Michael Carl, 141–63. Machine Translation: Technologies and Applications. Springer.
R Core Team. 2017. R: A Language and Environment for Statistical Computing (version 3.6.1). Vienna, Austria: R Foundation for Statistical Computing. [URL]
Sabet, Masoud Jalili, Philipp Dufter, François Yvon, and Hinrich Schütze. 2020. “SimAlign: High Quality Word Alignments without Parallel Training Data Using Static and Contextualized Embeddings.” In EMNLP (Findings) 2020: ArXiv:2004.08728 [Cs]. Online: ArXivLabs. [URL]
Schaeffer, Moritz, and Michael Carl. 2014. “Measuring the Cognitive Effort of Literal Translation Processes.” In Proceedings of the Workshop on Humans and Computer-Assisted Translation (HaCaT), ed. by Ulrich Germann, Michael Carl, Philipp Koehn, Germán Sanchis Trilles, Francisco Casacuberta, Robin Hill, and Sharon O’Brien, 29–37. Stroudsburg, Pennsylvania, USA: Association for Computational Linguistics.
Toledo-Báez, M. Cristina, and Michael Carl. 2020. “Assessing Low and High Translation Variation in Post-Editing.” In TT5 Translation in Transition Book of Abstracts, 41–45. Kent State University, Kent, Ohio, USA. [URL]
Cited by (1)
Carl, Michael. 2024. “An Active Inference Agent for Modeling Human Translation Processes.” Entropy 26 (8): 616.