HypoTerm is a data-driven semantic relation finder that starts from a list of domain- and user-specific terms automatically extracted from technical corpora, and generates a list of relations between these terms. This study focuses on the detection of hypernym relations between relevant terms and named entities. To detect all relevant hypernym relations in technical texts, we combined a lexico-syntactic pattern-based approach with a morpho-syntactic analyzer. To evaluate our relation finder, we constructed and manually annotated gold standard data for the dredging and financial domains in Dutch and English. The experimental results show that the HypoTerm system achieves high precision and recall for technical texts when starting from valid domain-specific terms and named entities. Thanks to this data-driven approach, it is possible to take an important step from terminology extraction to concept extraction without using any external lexico-semantic resources.
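The combination of the two approaches can be illustrated with a minimal Python sketch. It is not the HypoTerm implementation. It assumes a single English "such as" pattern for the lexico-syntactic part, and the head-final reading of English compounds (a multi-word term is a kind of its head) as a stand-in for the morpho-syntactic part. The function names `normalize`, `pattern_relations` and `head_relations`, the term list and the example sentence are all hypothetical.

```python
import re


def normalize(phrase):
    """Lowercase a phrase and strip one trailing plural 's' (a crude assumption)."""
    phrase = phrase.strip().lower()
    if phrase.endswith("s") and not phrase.endswith("ss"):
        return phrase[:-1]
    return phrase


def pattern_relations(sentence, terms):
    """Return (hyponym, hypernym) pairs found via the 'X such as Y, Z' pattern.

    Only phrases present in `terms` (singular, lowercase) are accepted, which
    mirrors the idea of starting from an already extracted term list.
    """
    relations = set()
    match = re.search(r"(.+?),? such as (.+)", sentence.rstrip(". "), re.IGNORECASE)
    if not match:
        return relations
    left_words = match.group(1).split()
    # Hypernym: the longest known term that ends the left context.
    hypernym = None
    for start in range(len(left_words)):
        candidate = normalize(" ".join(left_words[start:]))
        if candidate in terms:
            hypernym = candidate
            break
    if hypernym is None:
        return relations
    # Hyponyms: items of the enumeration that are known terms.
    for chunk in re.split(r",?\s+(?:and|or)\s+|,\s*", match.group(2)):
        hyponym = normalize(chunk)
        if hyponym in terms:
            relations.add((hyponym, hypernym))
    return relations


def head_relations(terms):
    """Return (hyponym, hypernym) pairs based on the head of multi-word terms.

    For each term, the longest proper suffix that is itself a term is taken
    as its hypernym, e.g. 'hopper dredger' -> 'dredger'.
    """
    relations = set()
    for term in terms:
        words = term.split()
        for start in range(1, len(words)):
            head = " ".join(words[start:])
            if head in terms:
                relations.add((term, head))
                break
    return relations


if __name__ == "__main__":
    # Hypothetical term list and sentence, for illustration only.
    terms = {"dredger", "hopper dredger", "cutter suction dredger"}
    sentence = "The fleet includes dredgers such as hopper dredgers and cutter suction dredgers."
    found = pattern_relations(sentence, terms) | head_relations(terms)
    for hyponym, hypernym in sorted(found):
        print(f"{hyponym} -> {hypernym}")
```

Taking the union of both relation sets reflects the idea that the two methods are complementary: the pattern only fires when a relation is stated explicitly in running text, while the head-based rule relies on the term list alone.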