In research on L2 English, recent corpus-based studies indicate that some non-standard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven, large-scale method to detect innovations, test it on verb + preposition structures (including phrasal verbs) and adjective + preposition structures, and describe similarities and differences between EFL and ESL. We use a dependency-parsed version of the International Corpus of Learner English to automatically extract potential innovations, defined as patterns of overuse relative to the British National Corpus as the reference corpus. We measure overuse by means of collocation measures such as O/E and T-score, and compare our results with similar results for ESL. In both quantitative and qualitative analyses, we detect similarities between the two varieties (e.g. discuss about) and dissimilarities (e.g. accuse for, which is distinctive only for EFL). We report more verb/adjective + preposition combinations than previous studies and discuss the roles of analogy and transfer.
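The two collocation measures named above can be illustrated with a minimal sketch. It uses the textbook definitions (expected frequency E = f(verb) × f(prep) / N, O/E as the ratio of observed to expected, and T-score = (O − E) / √O). The `PairCounts` class, the function names, and all counts are hypothetical and chosen only for illustration; they are not taken from the study, and comparing the scores of a learner corpus against a reference corpus in this way is an assumption about how overuse might be operationalized, not the authors' documented procedure.

```python
import math
from dataclasses import dataclass


@dataclass
class PairCounts:
    """Hypothetical frequency counts for one verb + preposition pair in one corpus."""
    pair: int   # O: observed frequency of the verb + preposition combination
    verb: int   # frequency of the verb in verb-preposition relations
    prep: int   # frequency of the preposition in verb-preposition relations
    total: int  # N: total number of verb-preposition relations in the corpus


def expected(c: PairCounts) -> float:
    """Expected pair frequency if verb and preposition were independent."""
    return c.verb * c.prep / c.total


def o_over_e(c: PairCounts) -> float:
    """O/E: ratio of observed to expected frequency."""
    return c.pair / expected(c)


def t_score(c: PairCounts) -> float:
    """T-score: (O - E) / sqrt(O); negative infinity if the pair is unattested."""
    if c.pair == 0:
        return float("-inf")
    return (c.pair - expected(c)) / math.sqrt(c.pair)


# Invented counts for illustration only (not results from ICLE or the BNC).
learner = PairCounts(pair=40, verb=200, prep=5000, total=100_000)
reference = PairCounts(pair=4, verb=400, prep=10_000, total=200_000)

for name, counts in (("learner", learner), ("reference", reference)):
    print(f"{name}: O/E = {o_over_e(counts):.2f}, T = {t_score(counts):.2f}")
```

Under these invented counts, the pair scores well above chance in the learner data and below chance in the reference data, which is the kind of contrast that would flag a combination as a candidate for closer qualitative inspection.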