Terminology structuring has been the subject of much work in the context of terms extracted from corpora: given a set of terms, obtained from an existing resource or extracted from a corpus, it consists in identifying hierarchical (or other types of) relations between these terms. The present work aims at assessing the feasibility of such structuring by studying it on an existing hierarchically structured terminology. Our overall goal is to test various structuring methods proposed in the literature and to check how they fare on this task. The specific goal at the present stage of our work, which we report here, is to evaluate lexical methods that match terms on the basis of their content words, taking morphological variants and synonyms into account. We describe experiments performed on the French version of the US National Library of Medicine MeSH thesaurus. We compare the lexically induced relations with the original MeSH relations and measure recall and precision, taking two different views on the task: relation recovery and term placement. This method proposes correct term placement for up to 26% of the MeSH concepts, and its precision can reach 58%. After this quantitative evaluation, we perform a qualitative, human analysis of the ‘new’ relations not present in the MeSH. This analysis shows, on the one hand, the limits of the lexical structuring method. On the other hand, it reveals some specific structuring choices and naming conventions made by the MeSH designers, and emphasizes ontological commitments that cannot be left to automatic structuring.
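To make the evaluation setting concrete, the following is a minimal sketch of one way lexically induced relations could be compared with reference relations. It is not the authors' implementation. It assumes a simple inclusion rule (a term is proposed as an ancestor of another when its content words form a proper subset of the other's), which is an assumption of this sketch and not a detail given in the abstract. The function names, the `normalize` hook standing in for morphological and synonym normalisation, and the English toy terms and reference relations are all hypothetical and are not taken from the MeSH experiments.

```python
def content_words(term, stop_words, normalize):
    """Return the set of normalised content words of a term."""
    return frozenset(
        normalize(word) for word in term.lower().split() if word not in stop_words
    )


def induce_relations(terms, stop_words=frozenset(), normalize=lambda w: w):
    """Propose (ancestor, descendant) pairs by content-word inclusion.

    Assumption of this sketch: a term is a candidate ancestor of another
    when its content words are a proper subset of the other's.
    """
    bags = {term: content_words(term, stop_words, normalize) for term in terms}
    relations = set()
    for child, child_bag in bags.items():
        for parent, parent_bag in bags.items():
            # '<' on frozensets tests for a proper subset.
            if parent_bag and parent_bag < child_bag:
                relations.add((parent, child))
    return relations


def precision_recall(induced, reference):
    """Compare induced relations with reference relations."""
    correct = induced & reference
    precision = len(correct) / len(induced) if induced else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data, invented for illustration only.
terms = ["pain", "back pain", "low back pain", "headache"]
reference = {
    ("pain", "back pain"),
    ("back pain", "low back pain"),
    ("pain", "headache"),  # no shared content word: not recoverable lexically
}

induced = induce_relations(terms)
precision, recall = precision_recall(induced, reference)
print(sorted(induced))
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy run the inclusion rule also proposes a link from "pain" to "low back pain", which is not a direct relation in the toy reference and so counts against precision, while the "pain" to "headache" relation cannot be found from content words at all and counts against recall.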