In this paper we report on our recent work on clause alignment for English-Chinese bilingual legal texts using available lexical resources, including a bilingual legal glossary and a bilingual dictionary, for the purpose of acquiring examples at various linguistic levels for example-based machine translation. We present our formulation of a measure for the similarity of a candidate pair of clauses with respect to matched lexical items, and the corresponding implementation of a clause alignment algorithm based on this similarity measure. Experimental results show that the similarity measure and the lexical-based clause alignment algorithm, though very simple, are very effective, achieving an alignment accuracy of 94.6%. This confirms our intuition that lexical information gives a reliable indication of correct alignment. The significance of this lexical-based approach lies in both its simplicity and its effectiveness.
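To make the idea of a lexical similarity measure concrete, the following is a minimal sketch under stated assumptions, not the formulation used in the paper. It assumes a Dice-style score over glossary entries matched in both clauses and a simple greedy best-match alignment. The glossary contents, the function names `similarity` and `align`, and the `threshold` parameter are all hypothetical and chosen for illustration only.

```python
import re

# Hypothetical toy glossary of (English term, Chinese term) pairs.
# A real system would load a bilingual legal glossary and dictionary.
GLOSSARY = [
    ("tenant", "租客"),
    ("landlord", "業主"),
    ("rent", "租金"),
    ("notice", "通知"),
    ("court", "法院"),
]


def similarity(en_clause, zh_clause):
    """Dice-style score over glossary entries (an assumed measure)."""
    tokens = set(re.findall(r"[a-z]+", en_clause.lower()))
    # Indices of glossary entries whose English side occurs as a token.
    en_hits = {i for i, (en, _) in enumerate(GLOSSARY) if en in tokens}
    # Indices of glossary entries whose Chinese side occurs as a substring.
    zh_hits = {i for i, (_, zh) in enumerate(GLOSSARY) if zh in zh_clause}
    total = len(en_hits) + len(zh_hits)
    if total == 0:
        return 0.0
    return 2 * len(en_hits & zh_hits) / total


def align(en_clauses, zh_clauses, threshold=0.5):
    """Greedy alignment: pair each English clause with its best-scoring
    Chinese clause, keeping the pair only if the score meets the threshold."""
    pairs = []
    if not zh_clauses:
        return pairs
    for i, en in enumerate(en_clauses):
        scores = [similarity(en, zh) for zh in zh_clauses]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j))
    return pairs


en_clauses = [
    "The tenant shall pay rent to the landlord.",
    "The court may give notice.",
]
zh_clauses = [
    "租客須向業主繳付租金。",
    "法院可發出通知。",
]
print(align(en_clauses, zh_clauses))
```

The greedy strategy is chosen here only for brevity; it ignores clause order and allows several English clauses to select the same Chinese clause, so a practical aligner would likely add ordering constraints or a dynamic-programming search.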