Be positive
Combining DocuScope with non-negative matrix factorization for
topic discovery
This chapter proposes a novel method that deploys
non-negative matrix factorization to extract topic models from
texts. This topic modeling process reveals how terms and DocuScope
Language Action Types (LATs) align, providing robust
information on what texts are about and how they are organized
rhetorically. Moreover, the non-negative nature of the topics means
that each derived topic can be viewed as a sum of topical features,
which can greatly ease the interpretive process. To elucidate and
benchmark this method, I apply it to the well-known 20
Newsgroups dataset and examine a sample of the results.
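
To make the additive reading of topics concrete, here is a minimal sketch of non-negative matrix factorization using the standard Lee-Seung multiplicative updates. It is an illustration under stated assumptions, not the chapter's implementation: the toy count matrix, the term columns, and the two LAT column labels are hypothetical placeholders, and the function names (`nmf`, `top_features`) are invented for this example. Each row of `V` is a document, and each column is either a term count or a LAT count, so both feature types share one factorization.

```python
import random

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(x * y for x, y in zip(row, col)) for col in Bt] for row in A]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate non-negative V (docs x features) as W @ H, with
    W (docs x k) and H (k x features) kept non-negative by
    Lee-Seung multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, matmul(W, H))
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def top_features(H, names, n=3):
    """For each topic (row of H), list the n highest-weighted feature names."""
    return [[names[j] for j in sorted(range(len(row)), key=lambda j: -row[j])[:n]]
            for row in H]

# Hypothetical toy data: four term columns plus two placeholder LAT columns.
names = ["hockey", "team", "nasa", "orbit", "LAT:placeholder_A", "LAT:placeholder_B"]
V = [
    [3, 2, 0, 0, 2, 0],
    [2, 3, 0, 0, 3, 0],
    [0, 0, 3, 2, 0, 2],
    [0, 0, 2, 3, 0, 3],
]

W, H = nmf(V, k=2)
WH = matmul(W, H)
error = sum((V[i][j] - WH[i][j]) ** 2 for i in range(len(V)) for j in range(len(V[0])))

for t, feats in enumerate(top_features(H, names)):
    print(f"topic {t}: {feats}")
print(f"squared reconstruction error: {error:.3f}")
```

Because every entry of `W` and `H` is non-negative, a document's row in `V` is rebuilt only by adding topic rows together, never by cancelling one against another. That is why each topic can be read directly as a weighted list of terms and LATs.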
Article outline
- 1. Introduction
- 2. Non-negative matrix factorization for topic modeling
- 3. Methodology
- 4. Results and discussion
- 5. Conclusions
- Notes
- References