Email Phishing Detection Using Machine Learning Approaches

Akpan, Etoroabasi (ORCID: https://orcid.org/0009-0007-2084-0591), Mishra, Bhupesh Kumar, Sayers, William (ORCID: https://orcid.org/0000-0003-1677-4409) and Loukil, Zainab (ORCID: https://orcid.org/0000-0003-2731-7051) (2025) Email Phishing Detection Using Machine Learning Approaches. In: Intelligent Systems with Applications in Communications, Computing and IoT (ICISCCI 2024). Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering Ser. (621). Springer, Switzerland, pp. 70-85. ISBN 9783031926136

15215 Etoroabasi, A. et al (2025) Email Phishing Detection Using Machine.pdf - Accepted Version
Restricted to Repository staff only
Available under License All Rights Reserved.

Abstract

Phishing attacks significantly threaten individuals and organizations, resulting in substantial liabilities. This study investigates multiple machine learning and deep learning approaches to enhance phishing detection, utilizing 208,704 emails comprising 108,693 legitimate and 99,225 phishing emails. Techniques explored include Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Long Short-Term Memory (LSTM), and Bidirectional LSTM (Bi-LSTM). Natural Language Processing (NLP) techniques, such as FastText embedding and lemmatization, were utilized for email text preprocessing. The models were evaluated using performance metrics such as recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (ROC-AUC). The experimental results showed that the SVM model achieved an F1-score of 95.43%, a recall of 95.43%, and an AUC of 95.36%. The LSTM model obtained an F1-score of 91.76%, a recall of 91.77%, and an AUC of 97.16%. These findings indicate that SVM excels in precision and recall, while LSTM performs better in distinguishing between phishing and non-phishing emails, as evidenced by its higher AUC. This research contributes to cybersecurity by showcasing the effectiveness of advanced machine learning and deep learning models in enhancing phishing email detection, leading to more secure digital communication environments. © ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2026.
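
Illustrative note (not part of the published abstract): the abstract reports one model with the higher F1-score and recall and another with the higher AUC. The sketch below shows, on a tiny made-up example, how that can happen: recall and F1 depend on a fixed decision threshold, while ROC-AUC only measures how well scores rank phishing above legitimate emails. The labels, scores, the 0.5 threshold, the names `model_a` and `model_b`, and the choice of phishing as the positive class are all assumptions for illustration; they are not the paper's data, models, or averaging scheme.

```python
# Toy comparison of threshold-based metrics (recall, F1) with ROC-AUC.
# All data below is hypothetical and chosen only to illustrate the idea.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn) with label 1 (phishing) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn

def recall(y_true, y_pred):
    tp, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0

def f1_score(y_true, y_pred):
    tp, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold(scores, cutoff=0.5):
    """Turn scores into hard 0/1 predictions at an assumed cutoff."""
    return [1 if s >= cutoff else 0 for s in scores]

# Hypothetical ground truth: 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# model_a: good hard decisions at 0.5, but one ranking mistake.
model_a = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2, 0.2, 0.1]
# model_b: perfect ranking, but scores sit low, so 0.5 misses two phishing emails.
model_b = [0.45, 0.48, 0.6, 0.7, 0.1, 0.2, 0.3, 0.4]

if __name__ == "__main__":
    for name, scores in (("model_a", model_a), ("model_b", model_b)):
        preds = threshold(scores)
        print(name,
              "recall=%.3f" % recall(y_true, preds),
              "f1=%.3f" % f1_score(y_true, preds),
              "auc=%.3f" % roc_auc(y_true, scores))
```

In this toy case `model_a` has the higher recall and F1 at the 0.5 cutoff, while `model_b` has the higher AUC, which mirrors the kind of split the abstract describes without reproducing its numbers.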

Item Type: Book Section
Uncontrolled Keywords: Bidirectional Long Short-Term Memory (Bi-LSTM); Cyber Security; Deep Learning (DL); FastText Embedding; Logistic Regression (LR); Long Short-Term Memory (LSTM); Machine Learning (ML); Natural Language Processing (NLP); Phishing; Random Forest (RF); Support Vector Machine (SVM)
Subjects: H Social Sciences > HD Industries. Land use. Labor > HD28 Management. Industrial Management > HD61 Risk in industry. Risk management
Q Science > QA Mathematics > QA76 Computer software
Divisions: Schools and Research Institutes > School of Business, Computing and Social Sciences
Depositing User: Kamila Niekoraniec
Date Deposited: 20 Aug 2025 11:10
Last Modified: 29 Aug 2025 08:00
URI: https://eprints.glos.ac.uk/id/eprint/15215
