O’Donnell et al. (2013) considered four measures of formulaicity and reported that the measures produced different results concerning the effects of expertise and first/second language status on formulaic sequence use in academic writing. The current study explores several additional methodological issues using the same dataset. We first motivate the need for criterial consistency and investigate whether frequency- and association-based measures yield different results when both are obtained using corpus-internal criteria. We then gauge the informativeness of the diversity dimension of formulaic sequence use by comparing the results of phrase-frame type-token ratio against those of the other measures. Finally, we profile the distribution of formulaic sequences across quartiles of different measures to assess the effect of varying measure thresholds. Our findings highlight the importance of criterial consistency, formulaic sequence diversity, and threshold variation in formulaic language research.
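To make the contrast between the measure types concrete, the following is a minimal Python sketch, not the procedure used in the study. It assumes a pre-tokenized corpus and computes (a) raw bigram frequency and pointwise mutual information from counts internal to that corpus, and (b) a type-token ratio over phrase-frames. The function names `bigram_scores` and `phrase_frame_ttr`, the use of pointwise MI as the association measure, and the restriction of the variable slot to internal positions are all illustrative assumptions.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Map each bigram to (raw frequency, pointwise MI in bits).

    All probabilities are estimated from `tokens` itself, i.e. with
    corpus-internal counts only (an assumption of this sketch).
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_bigrams = max(n - 1, 1)
    scores = {}
    for (w1, w2), freq in bigrams.items():
        p_xy = freq / n_bigrams
        p_x = unigrams[w1] / n
        p_y = unigrams[w2] / n
        scores[(w1, w2)] = (freq, math.log2(p_xy / (p_x * p_y)))
    return scores


def phrase_frame_ttr(tokens, n=3):
    """Type-token ratio of phrase-frames.

    A phrase-frame is taken here to be an n-gram with one internal
    word replaced by '*' (hypothetical operationalization).
    """
    frames = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        for slot in range(1, n - 1):
            frames[tuple(gram[:slot] + ["*"] + gram[slot + 1:])] += 1
    total = sum(frames.values())
    return len(frames) / total if total else 0.0


if __name__ == "__main__":
    toy = "the end of the day the end of the week".split()
    print(bigram_scores(toy)[("the", "end")])
    print(phrase_frame_ttr(toy, n=3))
```

Under these assumptions, a frequency-based measure ranks sequences by the first value in each tuple and an association-based measure by the second, while the phrase-frame ratio reflects how varied the frames are rather than how often any one of them recurs.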