Distributional semantics offers new ways to study the semantics of morphology. This study focuses on the semantics
of noun singulars and their plural inflectional variants in English. Our goal is to compare two models for the conceptualization
of plurality. One model (FRACSS) proposes that all singular-plural pairs should be taken into account when predicting plural
semantics from singular semantics. The other model (CCA) argues that conceptualization for plurality depends primarily on the
semantic class of the base word. We compare the two models on the basis of how well the speech signal of plural tokens in a large
corpus of spoken American English aligns with the semantic vectors predicted by the two models. Two measures are employed: the
performance of a form-to-meaning mapping and the correlations between form distances and meaning distances. Results converge on a
superior alignment for CCA. Our results suggest that usage-based approaches to pluralization in which a given word’s own semantic
neighborhood is given priority outperform theories according to which pluralization is conceptualized as a process building on
high-level abstraction. We see that what has often been conceived of as a highly abstract concept, [+plural], is better
captured via a family of mid-level partial generalizations.
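The contrast between the two prediction strategies, and the second of the two evaluation measures, can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' implementation. The FRACSS-style model is approximated here as a single least-squares linear map fitted over all singular-plural pairs, and the CCA-style model as a class-specific average shift vector added to the singular vector. All data are synthetic, the two semantic classes are hypothetical, and the "form" vectors are random placeholders standing in for features of the speech signal. The function names are introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic toy data (not from the study): 6 nouns, 2 hypothetical semantic
# classes, 4-dimensional "semantic" vectors.
labels = np.array([0, 0, 0, 1, 1, 1])
true_shift = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 2.0, 0.0]])
singular = rng.normal(size=(6, 4))
plural = singular + true_shift[labels] + 0.01 * rng.normal(size=(6, 4))


def fit_global_map(S, P):
    """Least-squares matrix M with S @ M ~ P, fitted over all pairs at once."""
    M, *_ = np.linalg.lstsq(S, P, rcond=None)
    return M


def class_average_shifts(S, P, y):
    """Mean (plural - singular) shift vector for each class label in y."""
    return {int(c): (P[y == c] - S[y == c]).mean(axis=0) for c in np.unique(y)}


def predict_by_class(S, y, shifts):
    """Add each word's class shift vector to its singular vector."""
    return np.stack([s + shifts[int(c)] for s, c in zip(S, y)])


def pairwise_distance_correlation(A, B):
    """Pearson r between pairwise Euclidean distances among rows of A and of B."""
    iu = np.triu_indices(len(A), k=1)
    dist_a = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)[iu]
    dist_b = np.linalg.norm(B[:, None, :] - B[None, :, :], axis=-1)[iu]
    return float(np.corrcoef(dist_a, dist_b)[0, 1])


# Predictions from the two sketched strategies (fitted and evaluated in-sample).
pred_global = singular @ fit_global_map(singular, plural)
shifts = class_average_shifts(singular, plural, labels)
pred_class = predict_by_class(singular, labels, shifts)

# Placeholder "form" vectors standing in for features of the speech signal.
form = plural @ rng.normal(size=(4, 3)) + 0.1 * rng.normal(size=(6, 3))

for name, pred in [("global map", pred_global), ("class shift", pred_class)]:
    err = np.linalg.norm(pred - plural, axis=1).mean()
    r = pairwise_distance_correlation(form, pred)
    print(f"{name}: mean error {err:.3f}, form-meaning distance r {r:.3f}")
```

Because the toy plurals are generated from class-specific shifts and both models are evaluated on the data they were fitted to, the printed numbers only show how the quantities are computed. They say nothing about which model aligns better with real speech.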