Chapter 3
Finding long-distance dependencies in the Lassy Corpus
In this paper, we present the results of searching for long-distance dependencies in an automatically annotated treebank for Dutch. We concentrate on phenomena that have recently been the subject of debate, and for which conflicting claims have been made about whether the constructions actually occur with some frequency in spontaneous language use. Long-distance dependencies involving a tensed or infinitival subordinate clause are quite rare and show collocational effects. Resumptive prolepsis and R-pronominal parasitic gaps are outside the scope of the computational grammar. We show that, even in such cases, access to syntactic annotation helps to find positive examples relatively quickly.
Article outline
- 1. Introduction
- 2. Background
- 3. Non-local dependencies in the Lassy Corpus
- 4. Long-distance dependencies
- 5. Long-distance dependencies with non-finite clauses
- 6. Resumptive prolepsis
- 7. R-pronominal parasitic gaps
- 8. Conclusions
- Notes
- References