Article published in:
International Journal of Corpus Linguistics: Online-First Articles
Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis
The case of apology
Certain forms of linguistic annotation, such as part-of-speech and semantic tagging, can be automated with high
accuracy. However, manual annotation is still necessary for complex pragmatic and discursive features that lack a direct mapping
to lexical forms. This manual process is time-consuming and error-prone, limiting the scalability of function-to-form approaches
in corpus linguistics. To address this, our study explores the possibility of using large language models (LLMs) to automate
pragma-discursive corpus annotation. We compare GPT-3.5 (the model behind the free-to-use version of ChatGPT), GPT-4 (the model
underpinning the precise mode of the Bing chatbot), and a human coder in annotating apology components in English based on the local
grammar framework. We find that GPT-4 outperformed GPT-3.5, with its accuracy approaching that of the human coder. These results suggest
that LLMs can be successfully deployed to aid pragma-discursive corpus annotation, making the process more efficient, scalable,
and accessible.
Keywords: corpus pragmatics, large language models, pragma-discursive corpus annotation, local grammar, ChatGPT
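To make the annotation setup more concrete, the sketch below shows one way an LLM could be prompted to label apology components along the lines of the local grammar framework discussed in the article. The label set (Apologiser, Apologising, Apologisee, Intensifier, Reason, plus a "no apology" option) is taken from the article outline; the prompt wording, the annotate helper, the model name, and the use of the OpenAI Python client are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch of LLM-assisted apology annotation (illustrative assumptions only).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Component labels drawn from the article outline (Sections 4.2.1-4.2.6).
LABELS = ["Apologiser", "Apologising", "Intensifier", "Apologisee", "Reason"]

PROMPT_TEMPLATE = (
    "You are annotating apologies using a local grammar framework.\n"
    "Tag the functional components of the utterance below with these labels: "
    + ", ".join(LABELS) + ".\n"
    "If the utterance contains no apology, reply 'no apology'.\n"
    "Return one 'Label: text span' pair per line.\n\n"
    "Utterance: {utterance}"
)

def annotate(utterance: str, model: str = "gpt-4") -> str:
    """Send one utterance to the chat model and return its raw annotation."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(utterance=utterance)}],
        temperature=0,  # deterministic output makes runs easier to compare
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(annotate("I'm really sorry about what I said."))
    # One possible (model-dependent) output shape:
    #   Apologiser: I
    #   Intensifier: really
    #   Apologising: 'm sorry
    #   Reason: about what I said
```

Under this kind of setup, the raw model output would still need to be parsed and compared against human annotations (and against a second model) to obtain the accuracy figures the abstract refers to.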
Article outline
- 1. Introduction
- 2. Corpus annotation: Long-standing challenges, new opportunities
- 2.1 Challenges in automating pragmatic and discourse-level annotation
- 2.2 LLM-assisted corpus annotation
- 3. Data and methods
- 3.1 The annotation task
- 3.2 Prompt design
- 3.3 Performance evaluation
- 4. Results
- 4.1 GPT-3.5 versus GPT-4
- 4.2 GPT-4 versus a human annotator
- 4.2.1 Recognition of no apology
- 4.2.2 Recognition of apologising
- 4.2.3 Recognition of reason
- 4.2.4 Recognition of apologiser
- 4.2.5 Recognition of apologisee
- 4.2.6 Recognition of intensifier
- 4.3 Summary of findings
- 5. Conclusion
- Notes
- References
Published online: 3 June 2024
https://doi.org/10.1075/ijcl.23087.yu