Paraphrase and parallel treebank for the comparison of French and Chinese syntax
This paper proposes to study the contrastive syntax of French and Chinese through the lens of syntactic mismatches, using parallel treebanks. A syntactic mismatch is a divergence between the syntactic structure of a linguistic unit and that of its translation. Syntactic mismatches are formalized using the notion of paraphrase from Meaning-Text Theory, which captures mismatches at different levels of linguistic description (e.g. Semantic, Deep-Syntactic, and Surface-Syntactic). We report in detail on the types of paraphrases found in our seed corpus and show that Deep-Syntactic paraphrases constitute the best starting point for the study. We then show how, starting from the seed corpus, we semi-automatically constructed a multi-layer parallel treebank in which paraphrases are aligned and annotated.
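To make the definition of a syntactic mismatch concrete, here is a minimal Python sketch. It is not the authors' implementation, and it does not use the Meaning-Text inventory of relations. It compares two toy dependency trees under a word alignment and flags every aligned token whose head or relation label differs in the translation. The tree encoding, the function name `find_mismatches`, the simplified labels (`subj`, `obj`, `iobj`), and the example pair (French "Il me plaît" and Chinese "我喜欢他") are all illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: token -> (head token, relation label); the root has head None.
# Tokens are assumed unique within a sentence, which only holds for toy examples.
Tree = Dict[str, Tuple[Optional[str], str]]


def find_mismatches(
    src: Tree, tgt: Tree, align: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Return (source token, source relation, target relation) for each
    aligned token whose attachment differs between the two trees."""
    mismatches = []
    for tok, (head, rel) in src.items():
        if tok not in align:
            continue  # unaligned tokens are ignored in this sketch
        t_head, t_rel = tgt[align[tok]]
        # The head we would expect in the target if the structures were parallel.
        expected_head = align.get(head) if head is not None else None
        if t_head != expected_head or t_rel != rel:
            mismatches.append((tok, rel, t_rel))
    return mismatches


# "Il me plaît" (lit. "he pleases me") vs "我喜欢他" ("I like him"):
# the two verbs distribute their participants over different syntactic roles.
french: Tree = {"plaît": (None, "root"), "il": ("plaît", "subj"), "me": ("plaît", "iobj")}
chinese: Tree = {"喜欢": (None, "root"), "我": ("喜欢", "subj"), "他": ("喜欢", "obj")}
alignment = {"plaît": "喜欢", "il": "他", "me": "我"}

result = find_mismatches(french, chinese, alignment)
print(result)
```

In this toy pair the verbs align and every dependent keeps the same head, yet both dependents change relation, which is the kind of divergence the outline lists under Conversion. A real treebank would need token indices rather than word forms as keys, and a label set shared by both languages.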
Article outline
- 1. Introduction
- 2. Background and theoretical framework
- 2.1 Contrastive syntax and syntactic mismatches
- 2.2 The formalization of syntactic mismatches
- 2.3 The Meaning-Text Theory
- 2.3.1 Representation levels
- 2.3.2 Actantial relations in the Meaning-Text Model
- 2.3.3 The modules of the MTM
- 3. French-Chinese paraphrases in a literary corpus
- 3.1 Definition of a paraphrase
- 3.2 Semantic paraphrase
- 3.2.1 Semantic-Propositional paraphrase
- a. Expansion/reduction
- b. Addition/subtraction
- 3.2.2 Semantic-Communicative paraphrase
- 3.3 Deep-Syntactic paraphrase
- 3.3.1 Synonymy: L ≡ Syn(L)
- 3.3.2 Antonymy: L ≡ Anti(L) + NOT
- 3.3.3 Conversion: L ≡ Conv_ijkl(L)
- 3.3.4 Derivation: L ≡ Der(L)
- 3.4 Surface-Syntactic paraphrase
- 3.4.1 Different realization of a DSyntA
- 3.4.2 Pronominalization
- 4. French-Chinese multi-layer parallel treebank construction
- 4.1 Automatic annotation of surface and deep syntactic layers
- 4.2 Deep-Syntactic paraphrases annotation
- 5. Conclusion
- Acknowledgements
- Notes
- References