Email Phishing Detection Using Machine Learning Approaches

Akpan, Etoroabasi; Mishra, Bhupesh Kumar; Sayers, William; Loukil, Zainab

Akpan, Etoroabasi ORCID: https://orcid.org/0009-0007-2084-0591, Mishra, Bhupesh Kumar, Sayers, William ORCID: https://orcid.org/0000-0003-1677-4409 and Loukil, Zainab ORCID: https://orcid.org/0000-0003-2731-7051 (2025) Email Phishing Detection Using Machine Learning Approaches. In: Intelligent Systems with Applications in Communications, Computing and IoT (ICISCCI 2024). Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering Ser. (621). Springer, Switzerland, pp. 70-85. ISBN 9783031926136

Text
15215 Etoroabasi, A et al. (2025) Email Phishing Detection Using Machine Learning Approaches.pdf - Accepted Version
Restricted to Repository staff only
Available under License All Rights Reserved.

Official URL: https://doi.org/10.1007/978-3-031-92614-3_5

Abstract

Phishing attacks pose a significant threat to individuals and organizations and can result in substantial liabilities. This study investigates multiple machine learning and deep learning approaches to phishing detection, using 208,704 emails comprising 108,693 legitimate and 99,225 phishing emails. Techniques explored include Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Long Short-Term Memory (LSTM), and Bidirectional LSTM (Bi-LSTM). Natural language processing (NLP) techniques, such as FastText embedding and lemmatization, were used to preprocess the email text. The models were evaluated using recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (ROC-AUC). In the experiments, the SVM model achieved an F1-score of 95.43%, a recall of 95.43%, and an AUC of 95.36%. The LSTM model obtained an F1-score of 91.76%, a recall of 91.77%, and an AUC of 97.16%. These findings indicate that SVM excels in precision and recall, while LSTM is better at distinguishing between phishing and non-phishing emails, as evidenced by its higher AUC. This research contributes to cybersecurity by demonstrating the effectiveness of machine learning and deep learning models for phishing email detection, supporting more secure digital communication environments.

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2026.
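
The abstract compares models on recall and F1-score, which depend on a decision threshold, and on ROC-AUC, which depends only on how the scores rank the emails. The sketch below is an illustration only, not the authors' evaluation code: the labels and scores are made-up toy values, the 0.5 threshold is an assumption, the function names are hypothetical, and metrics are computed for the phishing class alone (the abstract does not state which averaging scheme the paper uses).

```python
from typing import List, Tuple


def recall_and_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Recall and F1 for the positive class (1 = phishing) from hard labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """ROC-AUC as the fraction of (phishing, legitimate) pairs in which the
    phishing email receives the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the paper): 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1]

# Hard predictions at an assumed threshold of 0.5.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

recall, f1 = recall_and_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"recall={recall:.3f}  f1={f1:.3f}  auc={auc:.3f}")
```

Because recall and F1 are tied to one threshold while ROC-AUC summarizes ranking quality across all thresholds, one model can lead on F1 and another on AUC, which is the pattern the abstract reports for SVM and LSTM.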

Item Type:	Book Section
Uncontrolled Keywords:	Bidirectional Long Short-Term Memory (Bi-LSTM); Cyber Security; Deep Learning (DL); FastText Embedding; Logistic Regression (LR); Long Short-Term Memory (LSTM); Machine Learning (ML); Natural Language Processing (NLP); Phishing; Random Forest (RF); Support Vector Machine (SVM)
Subjects:	H Social Sciences > HD Industries. Land use. Labor > HD28 Management. Industrial Management > HD61 Risk in industry. Risk management; Q Science > QA Mathematics > QA76 Computer software
Divisions:	Schools and Research Institutes > School of Business, Computing and Social Sciences
Depositing User:	Kamila Niekoraniec
Date Deposited:	20 Aug 2025 11:10
Last Modified:	24 Dec 2025 08:00
URI:	https://eprints.glos.ac.uk/id/eprint/15215
