An Ensemble Classification Algorithm based on Semantics for Text Data Streams with Concept Drifts

  • Gang Sun, Zhongxin Wang, Jia Zhao, Hao Wang, Xiaowen Guan

Abstract

Mining information of interest to users from text data streams with concept drifts is an active topic in natural language processing research. This paper proposes a new semantics-based ensemble classification algorithm for text data streams. The algorithm first applies the minimum-redundancy maximum-relevance (mRMR) feature selection method to remove irrelevant and redundant features from the text data stream. It then uses a topic model to compute semantic similarity in the text data stream and to detect concept drifts. Finally, an ensemble classification model classifies the text data stream. Experimental results show that the proposed algorithm detects concept drifts effectively and achieves good classification performance on text data streams.
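The drift-detection step can be illustrated with a minimal sketch. The abstract does not give the similarity measure, the topic model settings, or the decision threshold, so everything here is an assumption made for illustration: each chunk of the stream is taken to be already summarised as a topic distribution (for example, by a topic model such as LDA), similarity is measured with the Jensen-Shannon divergence, and the names `js_divergence`, `detect_drifts`, and the threshold of 0.2 are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for disjoint ones.
    """
    # Mixture distribution; m[i] > 0 whenever p[i] > 0 or q[i] > 0.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, skipping zero-probability terms.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def detect_drifts(chunk_topics, threshold=0.2):
    """Return indices of chunks whose topic distribution diverges from
    the previous chunk by more than `threshold` (an assumed value)."""
    drifts = []
    for i in range(1, len(chunk_topics)):
        if js_divergence(chunk_topics[i - 1], chunk_topics[i]) > threshold:
            drifts.append(i)
    return drifts


# Hypothetical per-chunk topic distributions over three topics.
chunks = [
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.10, 0.20, 0.70],  # dominant topic changes here
    [0.12, 0.18, 0.70],
]
print(detect_drifts(chunks))  # prints [2]
```

In a full pipeline, a detected drift would typically trigger an update of the ensemble (for instance, training a new base classifier on the latest chunk); how the proposed algorithm handles this is not stated in the abstract.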

Published
2020-01-31
How to Cite
Gang Sun, Zhongxin Wang, Jia Zhao, Hao Wang, Xiaowen Guan. (2020). An Ensemble Classification Algorithm based on Semantics for Text Data Streams with Concept Drifts. Design Engineering, 147 - 154. https://doi.org/10.17762/de.vi.14
Section
Articles