A New Feature Selection Technique for Author Profiling

  • Upendar Para, M. S. Patel

Abstract

Author profiling is the task of determining characteristics of authors, such as age, gender, fame, occupation, native language, location and personality traits, by analysing their texts. Author profiling techniques are used extensively in applications such as forensic analysis, security, marketing, education, reputation management, fake profile detection and sentiment analysis. Extracting author details from textual data on the internet has become a popular research problem, and researchers have concentrated on author profiling techniques to learn the profiles of authors. Profile prediction begins with identifying suitable stylistic features that differentiate authors' writing styles. Many researchers have extracted different types of stylistic features in their approaches to predicting author profiles. Most have achieved good accuracies but were not satisfied with the results. Researchers have also experimented with content-based features such as character, word and Part Of Speech (POS) tag N-grams and observed that prediction accuracies improved. Some researchers used large numbers of features and applied feature selection techniques to reduce the feature set size. In this work, experiments were conducted with stylistic features and with character, word and POS N-gram features. The resulting accuracies of gender and age prediction were not satisfactory. A new feature selection algorithm based on the weights of the features is proposed, and the accuracies were observed to improve with a reduced feature set. The Bag Of Words model is used to represent documents as vectors, and these document vectors are passed to machine learning algorithms. The accuracies obtained for age and gender prediction were more promising than those of most popular author profiling techniques.
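The pipeline outlined in the abstract (N-gram extraction, weight-based feature selection, Bag Of Words vectors) can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so the weight used here, the gap between a feature's per-class document frequencies, is only an assumed stand-in and not the authors' algorithm. All function names are hypothetical, and POS N-grams are left out because they would require a tagger.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Character N-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def word_ngrams(text, n=1):
    """Word N-grams of a whitespace-tokenised, lowercased text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text):
    """Prefix each feature so character and word N-grams stay distinct."""
    return (["c:" + g for g in char_ngrams(text, 3)]
            + ["w:" + g for g in word_ngrams(text, 1)])

def feature_weights(docs, labels):
    """ASSUMED weight: largest gap between per-class document frequencies.

    A feature seen in every document of one class and in none of another
    gets weight 1.0; a feature spread evenly across classes gets 0.0.
    """
    class_sizes = Counter(labels)
    doc_freq = {c: Counter() for c in class_sizes}
    for doc, label in zip(docs, labels):
        # Count each feature at most once per document.
        doc_freq[label].update(set(extract_features(doc)))
    vocab = set().union(*doc_freq.values())
    weights = {}
    for feat in vocab:
        rates = [doc_freq[c][feat] / class_sizes[c] for c in class_sizes]
        weights[feat] = max(rates) - min(rates)
    return weights

def select_features(weights, k):
    """Keep the k highest-weighted features (ties broken alphabetically)."""
    ranked = sorted(weights, key=lambda f: (-weights[f], f))
    return ranked[:k]

def bow_vector(text, selected):
    """Bag Of Words vector: counts of the selected features in the text."""
    counts = Counter(extract_features(text))
    return [counts[f] for f in selected]

# Toy data; the labels "a" and "b" stand in for profile classes.
docs = ["i love football", "football tonight",
        "i love shoes", "new shoes today"]
labels = ["a", "a", "b", "b"]

weights = feature_weights(docs, labels)
selected = select_features(weights, k=10)
vectors = [bow_vector(d, selected) for d in docs]
print(selected)
print(vectors)
```

The resulting vectors could then be passed to any standard classifier. Changing `k` trades feature set size against the information retained, which is the trade-off the abstract refers to.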

Published
2021-07-30
How to Cite
Upendar Para, M. S. Patel. (2021). A New Feature Selection Technique for Author Profiling. Design Engineering, 2868 - 2885. Retrieved from http://thedesignengineering.com/index.php/DE/article/view/3033
Section
Articles