Visual Analysis of Text Information Retrieval Based on High Order Markov Model

  • Yayun Fan, Jingjing Feng, Yan Hu
Keywords: Text classification, Hierarchical clustering method, Feature word vector, Markov model

Abstract

With the rapid growth and spread of the Internet and the rapid expansion of electronic text, organizing and managing this information effectively, and quickly finding what users need, is a major challenge in information science and technology. As a key technology for processing and organizing large volumes of text data, text classification can largely resolve the problem of information clutter, helping users locate the required information accurately and route information to the appropriate category. In this paper, a hierarchical clustering method is used to cluster feature word vectors. The results show that similarity within each cluster is high and that the feature words within a cluster carry similar classification information, so a cluster can represent feature words with similar classification characteristics. The paper proposes a method for serializing text through a hierarchical semantic cluster model; the resulting text sequence exhibits state-transition characteristics. Two problems are investigated, determining the cut-off threshold for the clustering iterations and the excessive cost of computing similarity between feature words, and corresponding solutions are given. Experiments show that the semantic cluster model reduces dimensionality and highlights classification information.
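
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the similarity measure, the linkage rule, how the threshold is chosen, or the Markov order, so cosine similarity, average linkage, a fixed `threshold`, and `order=2` are all assumptions made here for illustration. The function names and toy vectors are hypothetical as well.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(vectors, threshold):
    """Agglomerative clustering with average linkage (assumed linkage rule).

    Merging stops once the most similar pair of clusters falls below
    `threshold`, which plays the role of the cut-off threshold.
    Returns a mapping from each feature word to a cluster id.
    """
    clusters = [[w] for w in sorted(vectors)]

    def linkage(a, b):
        total = sum(cosine(vectors[x], vectors[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Exhaustive pairwise search: simple, but quadratic per iteration.
        (i, j), best = max(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return {w: cid for cid, members in enumerate(clusters) for w in members}


def serialize(tokens, word_to_cluster):
    """Replace each known feature word by its cluster id; drop other tokens."""
    return [word_to_cluster[t] for t in tokens if t in word_to_cluster]


def transition_counts(sequence, order=2):
    """Count (previous `order` states) -> next state transitions."""
    counts = Counter()
    for k in range(order, len(sequence)):
        counts[(tuple(sequence[k - order:k]), sequence[k])] += 1
    return counts


# Toy two-dimensional feature word vectors (made up for illustration).
toy_vectors = {
    "goal": [0.9, 0.1],
    "match": [0.8, 0.2],
    "stock": [0.1, 0.9],
    "bond": [0.2, 0.8],
}
word_to_cluster = cluster_words(toy_vectors, threshold=0.9)
sequence = serialize("goal match the stock bond goal".split(), word_to_cluster)
counts = transition_counts(sequence, order=2)
```

In this sketch the text is reduced from a vocabulary of words to a short sequence of cluster ids, which is one way to read the dimensionality reduction claimed above. The exhaustive pairwise search also makes visible why the cost of similarity computation between feature words becomes a concern as the vocabulary grows.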
Published
2021-04-18
How to Cite
Yayun Fan, Jingjing Feng, Yan Hu. (2021). Visual Analysis of Text Information Retrieval Based on High Order Markov Model. Design Engineering, 2021(3), 410-421. Retrieved from http://thedesignengineering.com/index.php/DE/article/view/1274
Section
Articles