Performance Analysis of Machine Learning Classifiers for Detecting Phishing Websites
Abstract
Phishing attacks are now a significant threat to people’s personal lives and their networking environment. The goal of this work is to build a classification methodology that predicts whether a website is a phishing website or a legitimate one based on a given set of predictors. The dataset consists of 5000 phishing URLs obtained from the PhishTank website and 5000 legitimate URLs of Alexa-listed websites provided by the University of New Brunswick. The paper applies different machine learning algorithms and identifies the best model for phishing detection. The classifiers' hyperparameters are tuned, and the models that achieve high true positives (TP) and low false positives (FP) are selected. The CatBoost classifier performed best among the classifiers evaluated, achieving an F1-score of 87.6 percent and the highest detection accuracy of 88.4 percent.
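The evaluation metrics named above can be made concrete with a short sketch. The snippet below computes accuracy, precision, recall, and F1-score from confusion-matrix counts using the standard definitions. The function name `classification_metrics` and the example counts are hypothetical, chosen only for illustration. They are not taken from the paper's experiments, and this is not the authors' evaluation code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp: phishing sites correctly flagged as phishing
    fp: legitimate sites wrongly flagged as phishing
    tn: legitimate sites correctly passed as legitimate
    fn: phishing sites wrongly passed as legitimate
    """
    total = tp + fp + tn + fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Illustrative counts only (assumed, not results from the paper).
    metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Because the dataset described above is balanced between the two classes, accuracy and F1-score can be expected to track each other fairly closely. On imbalanced data the two can diverge, which is one reason to report both.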