An inquiry into the semantic transparency and productivity of German particle verbs and derivational affixation
This study addresses the relation between morphological productivity and semantic transparency. Using distributional semantics, we compare German word formation with particles to derivational word formation. We observe that derivational suffixes, but not particles, tend to make strong independent semantic contributions to their carrier words. In two-dimensional t-SNE maps, complex words cluster by affix, but not by particle. Furthermore, the semantic vectors of suffixed words are predictable from their base words with higher accuracy than those of particle verbs. For particle verbs, but not affixed verbs, semantic similarity within the set of complex words correlates negatively with the number of types. Moreover, only for particle verbs does a greater number of observed types predict a reduced probability of observing unseen types. We propose that particle verbs primarily serve the onomasiological function of labeling, resulting in relatively idiosyncratic semantic vectors. By contrast, words sharing derivational affixes form distinct clusters in semantic space while maintaining strong and consistent semantic relations with their base words. This allows these words to serve not only as labels but also with an anaphoric function in discourse.
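The two quantities the abstract relates, the number of types with the probability of unseen types, and the semantic similarity within a set of complex words, can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the probability of an unseen type is approximated by the Good-Turing estimate (hapax legomena divided by tokens), within-category similarity is taken as the mean pairwise Pearson correlation of the semantic vectors, and the function names, toy tokens, and toy vectors are hypothetical. It is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def productivity_stats(tokens):
    """Return (V, V1, N, P) for the tokens of one word formation category.

    V is the number of types, V1 the number of hapax legomena (types seen
    exactly once), N the number of tokens, and P = V1 / N the Good-Turing
    estimate of the probability that the next token is an unseen type.
    (Assumption: this is one common estimator, not necessarily the paper's.)
    """
    counts = Counter(tokens)
    n_tokens = sum(counts.values())
    n_types = len(counts)
    n_hapax = sum(1 for c in counts.values() if c == 1)
    p_unseen = n_hapax / n_tokens if n_tokens else 0.0
    return n_types, n_hapax, n_tokens, p_unseen


def pearson(u, v):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    numerator = sum(a * b for a, b in zip(du, dv))
    denominator = sqrt(sum(a * a for a in du)) * sqrt(sum(b * b for b in dv))
    return numerator / denominator


def mean_pairwise_correlation(vectors):
    """Average correlation over all unordered pairs of semantic vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(pearson(u, v) for u, v in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy corpus of verbs formed with the particle "auf".
    toy_tokens = ["aufmachen", "aufmachen", "aufstehen",
                  "aufessen", "aufmachen", "aufstehen"]
    print(productivity_stats(toy_tokens))

    # Hypothetical 4-dimensional stand-ins for word embeddings.
    toy_vectors = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
    print(mean_pairwise_correlation(toy_vectors))
```

Computed per particle or affix, these two numbers give one data point each for the kind of comparison the abstract reports, for example plotting mean within-category correlation against the number of types.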
Article outline
- 1. Introduction
- 2. Productivity
- 3. Word embeddings, particle verbs and affixed words
- 3.1 Semantic transparency assessed with constituent vectors
- 3.2 Assessing transparency with averaged within-category correlations
- 4. Geometry of semantic transparency in semantic space
- 4.1 t-SNE analysis of semantic space
- 4.2 t-SNE analysis of shift vectors
- 5. Assessing transparency with functions for conceptualization
- 6. Discussion
- Acknowledgements
- Notes
Cited by (3)
Baayen, R. Harald. 2024. The wompom. Corpus Linguistics and Linguistic Theory 20:3, pp. 615 ff.
Lázaro, Miguel, Teresa Simón, Ainoa Escalonilla & Trinidad Ruiz. 2024. Mind the suffix: Pseudoword processing in children and adults. Journal of Experimental Child Psychology 245, pp. 105977 ff.
Shafaei-Bajestan, Elnaz, Masoumeh Moradipour-Tari, Peter Uhrig & R. Harald Baayen. 2024. The pluralization palette: unveiling semantic clusters in English nominal pluralization through distributional semantics. Morphology 34:4, pp. 369 ff.
This list is based on CrossRef data as of 21 December 2024. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.