Special Issue Article
Theorizing sustainable, low-resource MT in development settings
Pivot-based MT between Guatemala’s indigenous Mayan languages
This article conducts a meta-analysis of existing research to theorize how machine translation (MT) may help resolve underlying contradictions in the development sector that preclude achievement of the UN’s 10th Sustainable Development Goal: to reduce inequality within and among countries. Non-governmental organizations (NGOs) frequently work in dominant languages and neglect marginalized languages, reinforcing power imbalances between the Global North and Global South in development planning. MT between marginalized languages may improve collaboration between local communities to redress shared disadvantages. As an example, the article hypothesizes a sustainable, “low-tech” MT system pivoting through Spanish to translate between three Mayan languages in Guatemala: K’iche’, Q’eqchi’, and Mam. First, the article theorizes three key dimensions of the overall sustainability of low-resource MT in development: quality, social, and environmental. It then evaluates the sustainability of various MT architectures. Finally, it reaffirms the ability of indirect translation (classic pivot-based MT) to facilitate MT between low-resource languages.
Article outline
- 1. Introduction
- 2. Defining sustainable MT workflows for low-resource languages
- 2.1 Previous work on sustainable MT
- 2.2 Three sustainability dimensions for MT workflows for low-resource languages
- 3. Evaluating the sustainability of MT architectures for low-resource languages
- 3.1 Rule-based MT
- 3.2 Statistical MT
- 3.3 Classic neural MT
- 3.4 Multilingual neural MT
- 3.5 Hybrid MT
- 3.6 Evaluation results
- 4. Pivot-based MT as sustainable MT
- 5. Identifying additional challenges and mitigating strategies
- 5.1 Potential training data sets
- 5.2 Insufficient training data after pivoting
- 5.3 Morphological complexity
- 5.4 Inconsistent orthography
- 6. Limitations
- 7. Conclusion