Quantifying lexical and pronunciation variation between three Arabic varieties
This paper reports on computational measures that quantify the lexical and pronunciation variation between three varieties of Arabic: Moroccan Arabic, Egyptian Arabic, and Gulf Arabic. We provide three measures of linguistic variation, all computed from an elicitation of the Swadesh list. The first is a lexical measure based on the percentage of noncognate words. The second is also a lexical measure, but it takes pronunciation into account by using the IPA transcription of the same word list. The third is a pronunciation measure that computes the variation in the IPA transcriptions of only the cognate words in the Swadesh list. The results of all three measures show that geographically proximate varieties are also linguistically closer to each other.
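The three measures can be illustrated with a minimal sketch. It assumes that the distance between transcriptions is a length-normalized Levenshtein (edit) distance, which the abstract itself does not spell out, and that cognacy judgments are supplied by hand. All function names are hypothetical, the word forms are invented placeholders rather than data from the study, and each character is treated as one segment, which real IPA with diacritics would not allow without prior tokenization.

```python
from typing import List, Sequence, Tuple


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Edit distance divided by the longer length (an assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def noncognate_percentage(cognate: Sequence[bool]) -> float:
    """Measure 1: share of word-list items judged non-cognate, in percent."""
    return 100.0 * sum(not c for c in cognate) / len(cognate)


def mean_distance(pairs: Sequence[Tuple[str, str]],
                  cognate: Sequence[bool],
                  cognates_only: bool) -> float:
    """Measures 2 and 3: mean normalized distance over all items or cognates only."""
    dists: List[float] = [
        normalized_distance(a, b)
        for (a, b), c in zip(pairs, cognate)
        if c or not cognates_only
    ]
    return sum(dists) / len(dists) if dists else 0.0


# Invented placeholder forms for two hypothetical varieties (not real data).
pairs = [("abc", "abd"), ("xyz", "pqr"), ("mno", "mno"), ("taka", "tak")]
cognate = [True, False, True, True]  # hand-assigned cognacy judgments

lexical = noncognate_percentage(cognate)                          # measure 1
lexical_ipa = mean_distance(pairs, cognate, cognates_only=False)  # measure 2
pronunciation = mean_distance(pairs, cognate, cognates_only=True) # measure 3
```

In this sketch the second measure lets a non-cognate pair contribute whatever distance its transcriptions happen to have, while the third discards such pairs so that only sound differences between related words remain. Whether the paper normalizes or weights the distances in this way is not stated in the abstract.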