News Topic Discovery Method based on Minimum Entropy Principle

Hongbin Wang, Jingzhen Ye, Yantuan Xian*, Yafei Zhang

Abstract

News is a typical form of unstructured text and one of the most accessible media through which
people obtain social information. Because news on the Internet is redundant and complex,
automatic, fast, and accurate access to news topics has important application value. Traditional
topic discovery methods either rely on probabilistic topic models or first construct text
representations and then apply clustering algorithms. These methods face two difficulties. First,
the number of topics is hard to set in advance. Second, traditional text representations lack
semantic understanding of the text, and general-purpose clustering algorithms are not themselves
guided by the goal of topic discovery. To address these problems, this article exploits the
features of news text: starting from text keywords, it uses word embeddings to enrich the semantic
vector representation of each text and combines this with news entity elements to construct a
text association graph. It then applies hierarchical coding based on the minimum entropy principle
to random walks on this graph in order to mine high-density subgraphs, which are taken as the
news topics. Finally, experiments verify the feasibility and accuracy of the method.
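
To make the idea of minimum-entropy coding of random walks concrete, the following is a minimal sketch, not the authors' implementation. It assumes the objective resembles the two-level map equation used by Infomap-style community detection, which the abstract does not state explicitly. The graph, the edge weights, the node names, and the function name map_equation are all hypothetical and chosen only for illustration. The score is the expected number of bits per step needed to describe a random walk on an undirected weighted graph under a given grouping of nodes; a grouping that matches dense subgraphs should yield a shorter description.

```python
from collections import defaultdict
from math import log2


def plogp(x):
    """Return x * log2(x), with the convention 0 * log2(0) = 0."""
    return x * log2(x) if x > 0 else 0.0


def map_equation(edges, partition):
    """Two-level description length (bits per step) of a random walk.

    edges:     list of (u, v, weight) for an undirected graph (assumed input).
    partition: dict mapping each node to a module (candidate topic) id.
    """
    total = 2.0 * sum(w for _, _, w in edges)
    visit = defaultdict(float)      # stationary visit rate of each node
    exit_rate = defaultdict(float)  # rate at which the walk leaves each module
    for u, v, w in edges:
        visit[u] += w / total
        visit[v] += w / total
        if partition[u] != partition[v]:
            exit_rate[partition[u]] += w / total
            exit_rate[partition[v]] += w / total

    module_visit = defaultdict(float)
    for node, p in visit.items():
        module_visit[partition[node]] += p

    q = sum(exit_rate.values())
    return (plogp(q)
            - 2.0 * sum(plogp(qi) for qi in exit_rate.values())
            - sum(plogp(p) for p in visit.values())
            + sum(plogp(exit_rate[m] + module_visit[m]) for m in module_visit))


# Hypothetical text association graph: two dense triangles joined by one edge.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
         ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
         ("c", "d", 1.0)]

two_topics = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_topic = {n: 0 for n in "abcdef"}
bad_split = {"a": 0, "b": 0, "d": 0, "c": 1, "e": 1, "f": 1}

good = map_equation(edges, two_topics)
single = map_equation(edges, one_topic)
bad = map_equation(edges, bad_split)
print(f"two dense groups: {good:.3f} bits/step")
print(f"single group:     {single:.3f} bits/step")
print(f"poor split:       {bad:.3f} bits/step")
```

In this toy graph the grouping that follows the two dense triangles gives the shortest description, so a search that minimizes this quantity does not need the number of topics to be fixed in advance. A full method would also need a search procedure over groupings and possibly more than two levels, both of which are omitted here.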

Published
2020-06-30
How to Cite
Hongbin Wang, Jingzhen Ye, Yantuan Xian*, Yafei Zhang. (2020). News Topic Discovery Method based on Minimum Entropy Principle. Design Engineering, 238-259. https://doi.org/10.17762/de.vi.458