We propose a method for the automatic induction of categories of Spanish discourse markers using parallel corpora, based on a quantitative and empirical approach that minimises explicit linguistic knowledge. We conducted the analysis using a large Spanish-English parallel corpus. First, we…
We present a correlational study of the association between the use of orthotypographic marks and the neologicity of a term in a diachronic press corpus. The question is whether it is possible to observe empirically a pattern of evolution in the percentage of times that a new word is marked…
This paper presents the first results of a new method for terminology extraction based on distributional analysis. The intuition behind the algorithm is that single or multi-word lexical units that refer to specialised concepts will show a characteristic co-occurrence pattern, described as a…
Nazar, Rogelio, Jorge Vivaldi and Leo Wanner. 2012. Automatic taxonomy extraction for specialized domains using distributional semantics. Terminology 18:2, pp. 188–225.
This article explores a statistical, language-independent methodology for the construction of taxonomies of specialized domains from noisy corpora. In contrast to proposals that exploit linguistic information by searching for lexico-syntactic patterns that tend to express the hypernymy relation,…