Shahid, Usama (ORCID: https://orcid.org/0009-0005-6360-333X), Hussain, Muhammad Zunnurain and Sayers, William (ORCID: https://orcid.org/0000-0003-1677-4409)
(2025)
Computational Analysis of Quran Text Using Machine Learning and Large Language Models.
In: 2025 8th International Conference on Data Science and Machine Learning Applications (CDMA), 16-17 February 2025, Riyadh, Saudi Arabia.
ISBN 979-8-3315-3969-6
Text (Peer-reviewed version)
14964 Shahid (2025) Computational analysis of Quran text (accepted version).pdf - Accepted Version. Available under License Creative Commons Attribution 4.0.
Text
14964 Shahid, U., et al (2025) Computational Analysis of Quran Text Using Machine Learning and Large Language Models.pdf - Published Version. Restricted to Repository staff only. Available under License All Rights Reserved.
Abstract
The verses of the Quran are foundational for Muslims worldwide. Significant research has been dedicated to information retrieval (IR) from the Quran; however, multiple studies have focused on descriptive analysis and topic modelling of the Quran in Arabic and in translated versions. This study presents a comprehensive framework for analysing large bodies of textual data, using an English translation of the Quran. It first conducts a descriptive analysis of the verses to uncover various features, including readability, word clouds, significant n-grams, and network graphs illustrating word associations. The framework then applies machine learning techniques, specifically clustering models based on numerical vectors from text-embedding-3-large, to identify effective groupings of verses. Additionally, GPT-4-turbo is used for topic modelling within each cluster through prompt engineering, with the aim of improving the understanding of these clusters. The results include statistical information graphs and concise knowledge summaries that are beneficial to both domain experts and the wider populace.
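As a rough illustration of the n-gram step in the descriptive analysis mentioned in the abstract, the following minimal sketch counts word bigrams across a list of texts using only the Python standard library. The abstract does not say how the authors tokenised the verses or how they judged an n-gram to be significant, so the regular-expression tokeniser, the raw-frequency ranking, and the names `tokenize`, `top_ngrams`, and `sample_texts` are all assumptions made here for illustration. The sample sentences are hypothetical placeholders, not text from any translation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_ngrams(texts, n=2, k=3):
    """Count word n-grams within each text and return the k most frequent."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        # Slide a window of length n over the tokens of a single text,
        # so n-grams never span two different texts.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)


# Hypothetical placeholder sentences, not actual translation text.
sample_texts = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown cat sleeps near the lazy dog",
    "a quick brown fox meets the lazy dog",
]

top_bigrams = top_ngrams(sample_texts, n=2, k=3)
```

Ranking by raw frequency is the simplest possible choice; a real analysis would likely remove stop words or use an association measure, and the later stages of the framework (embedding, clustering, and prompting) are not covered by this sketch.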
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Subjects: | Q Science > QA Mathematics > QA76 Computer software |
| Divisions: | Schools and Research Institutes > School of Business, Computing and Social Sciences |
| Depositing User: | Kamila Niekoraniec |
| Date Deposited: | 11 Apr 2025 13:35 |
| Last Modified: | 24 Apr 2025 09:30 |
| URI: | https://eprints.glos.ac.uk/id/eprint/14964 |