Chapter 7
Frequential test of (S)OV as unmarked word order in Dutch and German clauses
A serendipitous corpus-linguistic experiment
Gerard Kempen | Max Planck Institute for Psycholinguistics, Nijmegen | Cognitive Psychology Unit, Leiden University
Karin Harbusch | Department of Computer Science, University of Koblenz-Landau
In a paper entitled “Against markedness (and what to replace it with)”, Haspelmath argues “that the term ‘markedness’ is superfluous”, and that frequency asymmetries often explain structural (un)markedness asymmetries (Haspelmath 2006). We investigate whether this argument applies to Object and Verb orders in main (VO, marked) and subordinate (OV, unmarked) clauses of spoken and written German and Dutch, using English (without VO/OV alternation) as control. Frequency counts from six treebanks (three languages, two output modalities) do not support Haspelmath’s proposal. However, they reveal an unexpected phenomenon, most prominently in spoken Dutch and German: a small set of extremely high-frequency finite verbs with unspecific meanings populates main clauses much more densely than subordinate clauses. We suggest these verbs accelerate the start-up of grammatical encoding, thus facilitating sentence-initial output fluency.
Article outline
- 1. Introduction
- 2. Methodology
- 3. Three frequential tests
- 4. Discussion: Time and fluency pressures can boost VO:OV ratios
- Notes
- References
- Appendix
References (20)
Beek, Leonoor van der, Gosse Bouma, Robert Malouf & Gertjan van Noord. 2002. The Alpino Dependency Treebank. In Tanja Gaustad (ed.), Computational Linguistics in the Netherlands 2001. Amsterdam: Rodopi. 

Brants, Sabine, Stefanie Dipper, Peter Eisenberg, Silvia Hansen-Schirra, Esther König, Wolfgang Lezius, Christian Rohrer, George Smith & Hans Uszkoreit. 2004. TIGER: Linguistic Interpretation of a German Corpus. Research on Language and Computation 2. 597–620. 

Charniak, Eugene, Don Blaheta, Niyu Ge, Keith Hall, John Hale & Mark Johnson. 2000. BLLIP 1987–89 WSJ Corpus Release 1 LDC2000T43. DVD. Philadelphia: Linguistic Data Consortium.
Drach, Erich. 1937. Grundgedanken der deutschen Satzlehre. Frankfurt am Main: Diesterweg. [Reprinted in 1963]
Eerten, Laura van. 2007. Over het Corpus Gesproken Nederlands. Nederlandse Taalkunde 12. 194–215.
Godfrey, John J., Eduard C. Holliman & Jane McDaniel. 1992. SWITCHBOARD: Telephone speech corpus for research and development. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP-92), 517–520.
Haider, Hubert. 2010. Wie wurde Deutsch OV? Zur diachronen Dynamik eines Strukturparameters der germanischen Sprachen. In Arne Ziegler (ed.), Historische Textgrammatik und Historische Syntax des Deutschen – Traditionen, Innovationen, Perspektiven, 11–32. Berlin: De Gruyter.

Haspelmath, Martin. 2006. Against markedness (and what to replace it with). Journal of Linguistics 42. 25–70.

Hoekstra, Heleen, Michael Moortgat, Ineke Schuurman & Ton van der Wouden. 2001. Syntactic annotation for the spoken Dutch corpus project (CGN). Language and Computers 37(1). 73–87.
Höhle, Tilman N. 1986. Der Begriff ‘Mittelfeld’: Anmerkungen über die Theorie der topologischen Felder. In Walter Weiss, Herbert E. Wiegand & Marga Reis (eds.), Akten des VII. Internationalen Germanistenkongresses, 329–340. Tübingen: Niemeyer.
Kempen, Gerard & Karin Harbusch. 2016. Verb-second word order after German weil ‘because’: Psycholinguistic theory from corpus-linguistic data. Glossa: a journal of general linguistics 1(1). 1–32. 

König, Esther & Wolfgang Lezius. 2003. The TIGER language: A Description Language for Syntax Graphs, Formal Definition. Stuttgart: University of Stuttgart.
Koster, Jan. 1975. Dutch as an SOV Language. Linguistic Analysis 1. 111–136.
MacDonald, Maryellen C., Jessica L. Montag & Silvia P. Gennari. 2016. Are there really syntactic complexity effects in sentence production? A reply to Scontras, et al. (2015). Cognitive Science 40. 513–518.

Noord, Gertjan van, Gosse Bouma, Frank van Eynde, Daniël de Kok, Jelmar van der Linde, Ineke Schuurman, Erik Tjong Kim Sang & Vincent Vandeghinste. 2013. Large scale syntactic annotation of written Dutch: Lassy. In Peter Spyns & Jan Odijk (eds.), Essential speech and language technology for Dutch, 147–164. Berlin: Springer.

Oostdijk, Nelleke, Martin Reynaert, Véronique Hoste & Ineke Schuurman. 2013. The construction of a 500-million-word reference corpus of contemporary written Dutch. In Peter Spyns & Jan Odijk (eds.), Essential speech and language technology for Dutch, 219–247. Berlin: Springer. 

Stegmann, Rosemary, Heike Telljohann & Erhard W. Hinrichs. 2000. Stylebook for the German Treebank in Verbmobil. Saarbrücken: DFKI Report 239.
Wahlster, Wolfgang (ed.). 2000. Verbmobil: Foundations of speech-to-speech translation. Berlin: Springer. 

Cited by (2)
Kempen, Gerard & Karin Harbusch. 2019. Mutual attraction between high-frequency verbs and clause types with finite verbs in early positions: Corpus evidence from spoken English, Dutch, and German. Language, Cognition and Neuroscience 34(9). 1140 ff.
