Paraphrase and parallel treebank for the comparison of French and Chinese syntax
This paper studies the contrastive syntax of French and Chinese through the lens of syntactic mismatches, making use of parallel treebanks. A syntactic mismatch is a divergence between the syntactic structure of a linguistic unit and that of its translation. Syntactic mismatches are formalized using the notion of paraphrase from the Meaning-Text Theory, which captures mismatches at different levels of linguistic description (e.g. Semantic, Deep-Syntactic, and Surface-Syntactic). We report in detail on the types of paraphrases found in our seed corpus and show that Deep-Syntactic paraphrases constitute the best starting point for our study. We then show how, starting from the seed corpus, we semi-automatically constructed a multi-layer parallel treebank in which paraphrases are aligned and annotated.
Article outline
- 1. Introduction
- 2. Background and theoretical framework
- 2.1 Contrastive syntax and syntactic mismatches
- 2.2 The formalization of syntactic mismatches
- 2.3 The Meaning-Text Theory
- 2.3.1 Representation levels
- 2.3.2 Actantial relations in the Meaning-Text Model
- 2.3.3 The modules of the MTM
- 3. French-Chinese paraphrases in a literary corpus
- 3.1 Definition of a paraphrase
- 3.2 Semantic paraphrase
- 3.2.1 Semantic-Propositional paraphrase
- a. Expansion/reduction
- b. Addition/subtraction
- 3.2.2 Semantic-Communicative paraphrase
- 3.3 Deep-Syntactic paraphrase
- 3.3.1 Synonymy: L ≡ Syn(L)
- 3.3.2 Antonymy: L ≡ Anti(L) + NOT
- 3.3.3 Conversion: L ≡ Conv_ijkl(L)
- 3.3.4 Derivation: L ≡ Der(L)
- 3.4 Surface-Syntactic paraphrase
- 3.4.1 Different realization of a DSyntA
- 3.4.2 Pronominalization
- 4. French-Chinese multi-layer parallel treebank construction
- 4.1 Automatic annotation of surface and deep syntactic layers
- 4.2 Deep-Syntactic paraphrases annotation
- 5. Conclusion
- Acknowledgements
- Notes
- References