Annotating dialogue acts in speech data
Problematic issues and basic dialogue act categories
This paper aims to identify the most problematic issues in dialogue act annotation of speech
corpora and to define basic categories of dialogue acts. I critically examine and test generic schemes that represent different
lines of dialogue act annotation: AMI, DART, ISO 24617-2 and SWBD-DAMSL. The most problematic issues
concern the distinction between the semantic and pragmatic meanings of utterances, the annotation
of metadiscourse, and the adequacy and informativeness of the tagset. The basic dialogue act categories identified are information
providing, information seeking, actions, social acts and metadiscourse. The findings can help improve dialogue act annotation.
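As a rough illustration of how the five basic categories listed above might be used as a coarse tagset, the following Python sketch encodes them as an enumeration and counts labels over a few utterances. The class and function names (`DialogueActCategory`, `AnnotatedUtterance`, `category_counts`) and the sample utterances with their labels are assumptions made for illustration only; they are not taken from the paper or from any of the schemes it examines.

```python
from dataclasses import dataclass
from enum import Enum


class DialogueActCategory(Enum):
    """The five basic categories named in the abstract (identifiers are hypothetical)."""
    INFORMATION_PROVIDING = "information providing"
    INFORMATION_SEEKING = "information seeking"
    ACTION = "actions"
    SOCIAL = "social acts"
    METADISCOURSE = "metadiscourse"


@dataclass(frozen=True)
class AnnotatedUtterance:
    """One utterance with a single coarse dialogue act label."""
    speaker: str
    text: str
    category: DialogueActCategory


def category_counts(utterances):
    """Count utterances per category, keeping categories with zero hits."""
    counts = {category: 0 for category in DialogueActCategory}
    for utterance in utterances:
        counts[utterance.category] += 1
    return counts


# Invented utterances; the labels are illustrative guesses, not the paper's annotations.
sample = [
    AnnotatedUtterance("A", "Where is the meeting?", DialogueActCategory.INFORMATION_SEEKING),
    AnnotatedUtterance("B", "In room 4.", DialogueActCategory.INFORMATION_PROVIDING),
    AnnotatedUtterance("A", "Thanks.", DialogueActCategory.SOCIAL),
]
counts = category_counts(sample)
for category, n in counts.items():
    print(f"{category.value}: {n}")
```

A single label per utterance is a simplification chosen to keep the sketch short; multi-layer schemes such as those the paper tests may assign several tags to one unit.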
Article outline
- 1. Introduction
- 2. Dialogue act annotation schemes
- 3. Methodology
- 3.1 Selecting dialogue act annotation schemes
- 3.2 Test data
- 3.3 Annotation process
- 3.4 Analytical procedure
- 4. Dialogue act annotation
- 4.1 Applicability to a new language
- 4.2 Utterance meaning
- 4.3 Ambiguity
- 4.3.1 Basic unit
- 4.3.2 Tags
- 4.4 Adequacy
- 4.5 Informativeness
- 5. Dialogue act categories
- 5.1 Information-providing acts
- 5.2 Information-seeking acts
- 5.3 Action acts
- 5.4 Social acts
- 5.5 Metadiscourse acts
- 6. Conclusions