A set of well-known statistical filtering methods (binomial hypothesis testing, log-likelihood ratio, t-test, thresholds on relative frequencies) is applied to Modern Greek and English corpora in order to automatically acquire verb subcategorization frames that are neither limited in number nor known beforehand. As sophisticated linguistic resources and tools are not available for most languages (including Modern Greek), pre-processing of our corpora goes no further than elementary, intrasentential, non-embedded phrase chunking. Valid syntactic frames of verbs are detected by forming, permuting and counting subsets of each verb's neighboring set of phrases, and by applying the statistical filters mentioned above. The results achieved were comparable to and, in several cases, better than those of previous approaches, even approaches utilizing richer resources. Incorporating the extracted list of frames into a shallow parser increases the parser's performance by almost 6%, which shows the importance of the acquired knowledge.
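To make the filtering step more concrete, the following is a minimal sketch of one of the filters named above, binomial hypothesis testing, applied to candidate frames built from the chunks around a verb. It is not the authors' implementation. The chunk labels, the function names, the error probability `p_error` and the significance level `alpha` are all illustrative assumptions, and the sketch ignores phrase order for simplicity.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import comb


def candidate_frames(phrases):
    """Return every non-empty subset of the chunk labels near a verb.

    Subsets are represented as sorted tuples, so phrase order is ignored
    (a simplifying assumption of this sketch).
    """
    labels = sorted(phrases)
    frames = set()
    for size in range(1, len(labels) + 1):
        frames.update(combinations(labels, size))
    return frames


def binomial_tail(m, n, p):
    """P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def filter_frames(observations, p_error=0.05, alpha=0.05):
    """Keep the frames that co-occur with a verb too often to be noise.

    observations: iterable of (verb, list_of_chunk_labels) pairs.
    p_error: assumed probability that a frame appears next to a verb
             that does not actually take it (hypothetical value).
    alpha:   assumed significance level (hypothetical value).
    """
    verb_totals = Counter()
    frame_counts = defaultdict(Counter)
    for verb, phrases in observations:
        verb_totals[verb] += 1
        for frame in candidate_frames(phrases):
            frame_counts[verb][frame] += 1

    accepted = defaultdict(set)
    for verb, counts in frame_counts.items():
        n = verb_totals[verb]
        for frame, m in counts.items():
            # Reject the null hypothesis "this frame is noise" when seeing
            # m or more co-occurrences by chance alone would be unlikely.
            if binomial_tail(m, n, p_error) < alpha:
                accepted[verb].add(frame)
    return dict(accepted)


if __name__ == "__main__":
    # Toy data: hypothetical chunk labels around a hypothetical verb.
    toy = [("give", ["NP", "PP"])] * 10 + [("give", ["ADVP"])]
    print(filter_frames(toy))
```

On the toy data, the frame seen in ten of eleven occurrences passes the test, while the frame seen once does not. The other filters listed above (log-likelihood ratio, t-test, relative-frequency thresholds) could replace `binomial_tail` in the same loop.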