While discourse markers (DMs) and (dis)fluency have been extensively studied as separate phenomena, no corpus-based research has yet combined large-scale, fine-grained annotations of both categories. Integrating these two levels of analysis, while methodologically challenging, is both innovative and highly relevant to the investigation of spoken discourse in general and form-meaning patterns in particular. The aim of this paper is to provide corpus-based evidence of the register-sensitivity of DMs and other disfluencies (e.g. pauses, repetitions) and of their tendency to combine in recurrent clusters. These claims are supported by quantitative findings on the variation and combination of DMs with other (dis)fluency devices in DisFrEn, a richly annotated, comparable English-French corpus representative of eight different interaction settings. The analysis uncovers the prominent place of DMs within (dis)fluency and meaningful association patterns between forms and functions, in a usage-based approach to meaning-in-context.