Parallel Bilingual Datasets: A Multimodal Deep Learning Framework for Proficiency and Style Classification

Kesavan, Padmavathi, Lakshmi Travis, Miranda, Aruldoss, Martin and Wynn, Martin G (ORCID: https://orcid.org/0000-0001-7619-6079) (2026) Parallel Bilingual Datasets: A Multimodal Deep Learning Framework for Proficiency and Style Classification. Multimodal Technologies and Interaction, 10 (5). pp. 1-27. doi:10.3390/mti10050047

16235 Padmavathi et al. (2026) Parallel Bilingual Datasets.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

Abstract

This study presents a multimodal deep learning framework for automatic proficiency and style classification of parallel bilingual Tamil–Hindi learner data. The proposed system employs a dual-headed neural architecture to simultaneously predict proficiency levels (Basic, Advanced) and stylistic categories (Formal, Literary) using shared feature representations. A curated dataset of bilingual text samples is utilized, along with synthetic speech generated through text-to-speech (TTS) to enable controlled multimodal experimentation. Five deep learning architectures are evaluated under text-only, audio-only, and learnable fusion settings. Experimental findings indicate that text-based models consistently achieve strong performance in both proficiency and style classification tasks. In contrast, the audio-only model demonstrates limited effectiveness, highlighting the constraints of synthetic acoustic features in capturing meaningful linguistic information. The fusion models provide only marginal improvements over text-based approaches, suggesting that textual representations play a dominant role in proficiency and stylistic classification within controlled datasets. These results emphasize the importance of linguistic features over acoustic signals for automated language assessment in low-resource settings. The proposed framework provides a scalable and reproducible approach and offers a foundation for future work incorporating real speech data and more diverse linguistic inputs.
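The dual-headed design described in the abstract — one shared feature representation feeding two separate classification heads — can be illustrated with a minimal NumPy forward-pass sketch. This is not the authors' implementation: the layer dimensions, random weights, and the `relu`/`softmax` helpers are all assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical dimensions: 64-d input features, 32-d shared representation
d_in, d_shared = 64, 32
W_shared = rng.normal(size=(d_in, d_shared)) * 0.1
W_prof = rng.normal(size=(d_shared, 2)) * 0.1   # head 1: Basic vs. Advanced
W_style = rng.normal(size=(d_shared, 2)) * 0.1  # head 2: Formal vs. Literary

def dual_head_forward(x):
    h = relu(x @ W_shared)                 # shared feature representation
    return softmax(h @ W_prof), softmax(h @ W_style)

x = rng.normal(size=(4, d_in))             # batch of 4 feature vectors
p_prof, p_style = dual_head_forward(x)
print(p_prof.shape, p_style.shape)         # (4, 2) (4, 2)
```

Because both heads branch from the same hidden representation, a single backward pass would update the shared weights from both task losses — the mechanism by which such multi-task models let the proficiency and style objectives regularize one another.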

Item Type: Article
Article Type: Article
Uncontrolled Keywords: Multimodal learning; Language proficiency classification; Style classification; Deep learning; Tamil–Hindi dataset
Related URLs:
Subjects: Q Science > Q Science (General) > Q336 Artificial intelligence
Q Science > QA Mathematics > QA76 Computer software > QA76.9 Other topics > QA76.9.A43 Algorithms
Divisions: Schools and Research Institutes > School of Business, Computing and Social Sciences
Depositing User: Martin Wynn
Date Deposited: 06 May 2026 12:38
Last Modified: 06 May 2026 13:00
URI: https://eprints.glos.ac.uk/id/eprint/16235
