Psychometric properties of survey translations
A simulation study
Psychometrics involves the indirect measurement of latent constructs, including aspects of cognition and emotion,
and Likert-type scales are a common tool to operationalize and quantify these constructs. One threat to the psychometric
properties of such scales is the administration of surveys across multiple languages, which presupposes the translation of the
survey instruments. While multiple recommendations exist on best practices in translation, implementation does not always satisfy
such guidelines. This article employs Monte Carlo simulation to explore the potential effects of translation on survey measurement
and psychometric properties. Three possible challenges are explored, namely ambiguity, shift in valence, and issues with negation.
These translation effects are statistically modeled as increased variance, change in skewness, and reverse coding, respectively.
Additionally, the simulation examines the value of multi-item scales over single-item measurement. Overall, the results illustrate
how survey translation can impact exploratory factor analysis and reliability of measurement.
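
The three translation effects and the comparison of single-item with multi-item measurement can be illustrated with a small Monte Carlo sketch. This is not the article's own code or calibration. It assumes a one-factor model with five items loading 0.7, a 5-point response format, fixed cut points, and arbitrary effect sizes (one extra unit of error variance for ambiguity, a one-unit shift for valence). The function names and parameter values are hypothetical, and Python with NumPy is used purely for illustration. Only Cronbach's alpha and correlations with the latent factor are computed here; exploratory factor analysis is left out.

```python
import numpy as np

POINTS = 5  # assumed 5-point Likert format (illustrative choice)


def simulate_scale(n, loadings, rng, effect=None, target=0):
    """Simulate Likert responses from a one-factor model.

    effect is None, "ambiguity", "valence", or "negation" and is applied
    to the single item at index `target`.
    Returns (latent factor scores, n x k integer response matrix).
    """
    loadings = np.asarray(loadings, dtype=float)
    factor = rng.standard_normal(n)
    noise = rng.standard_normal((n, loadings.size))
    # Continuous item scores with unit variance before any translation effect
    scores = factor[:, None] * loadings + noise * np.sqrt(1.0 - loadings**2)

    if effect == "ambiguity":    # ambiguity modeled as increased variance
        scores[:, target] += rng.standard_normal(n)
    elif effect == "valence":    # valence shift modeled as skewed responses
        scores[:, target] += 1.0

    # Cut the continuous scores into ordered categories 1..POINTS
    cuts = np.linspace(-1.5, 1.5, POINTS - 1)
    likert = np.digitize(scores, cuts) + 1

    if effect == "negation":     # negation problem modeled as reverse coding
        likert[:, target] = POINTS + 1 - likert[:, target]
    return factor, likert


def cronbach_alpha(items):
    """Cronbach's alpha from an n x k response matrix."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_var / total_var)


def run_simulation(reps=200, n=300, loadings=(0.7,) * 5, seed=1):
    """Average alpha and factor correlations over `reps` replications."""
    rng = np.random.default_rng(seed)
    results = {}
    for effect in (None, "ambiguity", "valence", "negation"):
        alphas, r_single, r_scale = [], [], []
        for _ in range(reps):
            factor, likert = simulate_scale(n, loadings, rng, effect)
            alphas.append(cronbach_alpha(likert))
            # Single-item measure: the affected item alone
            r_single.append(np.corrcoef(factor, likert[:, 0])[0, 1])
            # Multi-item measure: mean of all items
            r_scale.append(np.corrcoef(factor, likert.mean(axis=1))[0, 1])
        results[effect or "baseline"] = {
            "alpha": float(np.mean(alphas)),
            "r_single": float(np.mean(r_single)),
            "r_scale": float(np.mean(r_scale)),
        }
    return results


if __name__ == "__main__":
    for name, stats in run_simulation().items():
        print(name, {key: round(val, 3) for key, val in stats.items()})
```

Under these assumptions, the reverse-coded item should depress alpha most, and the multi-item mean should track the latent factor more closely than any single item. The size of each effect depends entirely on the illustrative parameter values chosen above.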
Article outline
- 1. Introduction
- 2. Literature review
- 2.1 Measurement of latent constructs
- 2.2 Scale translation and adaptation
- 2.3 Statistical challenges: Single-item surveys and attenuation effects
- 2.4 Hypotheses
- 3. Methods
- 3.1 Factor loadings and calibration
- 3.2 Simulated translation effects
- 3.3 Output measures
- 3.4 Comparison of single-item and multi-item scales
- 4. Results
- 5. Discussion
- 5.1 Limitations and future research
- 6. Conclusion