Semantic embedding approaches commonly used in natural language processing, such as transformer models, have rarely
been used to examine L2 lexical knowledge. Importantly, their performance has not been contrasted with more traditional annotation
approaches to lexical knowledge. This study used NLP techniques based on lexical annotations and on semantic embeddings
to model the receptive vocabulary of L2 learners from their lexical production during a writing task. The goal of the study was
to examine the strengths and weaknesses of both approaches in understanding L2 lexical knowledge. Findings indicate that
transformer approaches based on semantic embeddings outperform linguistic annotations and Word2vec models in predicting L2
learners’ vocabulary scores. The findings support the strength and accuracy of semantic-embedding approaches as well as
their generalizability across tasks when compared to linguistic feature models. Limitations of semantic-embedding approaches,
especially their interpretability, are discussed.