Akpan, Etoroabasi ORCID: https://orcid.org/0009-0007-2084-0591, Mishra, Bhupesh Kumar, Sayers, William ORCID: https://orcid.org/0000-0003-1677-4409 and Loukil, Zainab ORCID: https://orcid.org/0000-0003-2731-7051 (2025) Email Phishing Detection Using Machine Learning Approaches. In: Intelligent Systems with Applications in Communications, Computing and IoT (ICISCCI 2024). Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering Ser. (621). Springer, Switzerland, pp. 70-85. ISBN 9783031926136
Text: 15215 Etoroabasi, A. et al (2025) Email Phishing Detection Using Machine.pdf - Accepted Version. Restricted to Repository staff only. Available under License All Rights Reserved. (2MB)
Abstract
Phishing attacks significantly threaten individuals and organizations, resulting in substantial liabilities. This study investigates multiple machine learning and deep learning approaches to enhance phishing detection, utilizing 208,704 emails comprising 108,693 legitimate and 99,225 phishing emails. Techniques explored include SVM, RF, LR, LSTM, and Bi-LSTM. NLP techniques, such as FastText embedding and lemmatization, were utilized for email text preprocessing. The models were evaluated using performance metrics such as recall, F1-score, and Area Under the Receiver Operating Characteristic Curve (ROC-AUC). The experimental results showed that the SVM model achieved an F1-score of 95.43%, a recall of 95.43%, and an AUC of 95.36%. The LSTM model obtained an F1-score of 91.76%, a recall of 91.77%, and an AUC of 97.16%. These findings indicate that SVM excels in precision and recall, while LSTM performs better in distinguishing between phishing and non-phishing emails, as evidenced by its higher AUC. This research contributes to cybersecurity by showcasing the effectiveness of advanced machine learning and deep learning models in enhancing phishing email detection, leading to more secure digital communication environments.
© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2026.
Item Type: | Book Section |
---|---|
Uncontrolled Keywords: | Bidirectional Long Short-Term Memory (Bi-LSTM); Cyber Security; Deep Learning (DL); FastText Embedding; Logistic Regression (LR); Long Short-Term Memory (LSTM); Machine Learning (ML); Natural Language Processing (NLP); Phishing; Random Forest (RF); Support Vector Machine (SVM) |
Subjects: | H Social Sciences > HD Industries. Land use. Labor > HD28 Management. Industrial Management > HD61 Risk in industry. Risk management; Q Science > QA Mathematics > QA76 Computer software |
Divisions: | Schools and Research Institutes > School of Business, Computing and Social Sciences |
Depositing User: | Kamila Niekoraniec |
Date Deposited: | 20 Aug 2025 11:10 |
Last Modified: | 29 Aug 2025 08:00 |
URI: | https://eprints.glos.ac.uk/id/eprint/15215 |