Parallel Bilingual Datasets: A Multimodal Deep Learning Framework for Proficiency and Style Classification

Kesavan, Padmavathi, Lakshmi Travis, Miranda, Aruldoss, Martin and Wynn, Martin G (ORCID: https://orcid.org/0000-0001-7619-6079) (2026) Parallel Bilingual Datasets: A Multimodal Deep Learning Framework for Proficiency and Style Classification. Multimodal Technologies and Interaction, 10 (5). pp. 1-27. doi:10.3390/mti10050047

16235 Padmavathi et al. (2026) Parallel Bilingual Datasets.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

Abstract

This study presents a multimodal deep learning framework for automatic proficiency and style classification of parallel bilingual Tamil–Hindi learner data. The proposed system employs a dual-headed neural architecture to simultaneously predict proficiency levels (Basic, Advanced) and stylistic categories (Formal, Literary) using shared feature representations. A curated dataset of bilingual text samples is utilized, along with synthetic speech generated through text-to-speech (TTS) to enable controlled multimodal experimentation. Five deep learning architectures are evaluated under text-only, audio-only, and learnable fusion settings. Experimental findings indicate that text-based models consistently achieve strong performance in both proficiency and style classification tasks. In contrast, the audio-only model demonstrates limited effectiveness, highlighting the constraints of synthetic acoustic features in capturing meaningful linguistic information. The fusion models provide only marginal improvements over text-based approaches, suggesting that textual representations play a dominant role in proficiency and stylistic classification within controlled datasets. These results emphasize the importance of linguistic features over acoustic signals for automated language assessment in low-resource settings. The proposed framework provides a scalable and reproducible approach and offers a foundation for future work incorporating real speech data and more diverse linguistic inputs.
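The dual-headed design described in the abstract — one shared feature representation feeding two separate classification heads — can be illustrated with a minimal NumPy forward-pass sketch. This is not the authors' implementation: the layer dimensions, random weights, and the `relu`/`softmax` helpers are all assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical dimensions: 64-d input features, 32-d shared representation
d_in, d_shared = 64, 32
W_shared = rng.normal(size=(d_in, d_shared)) * 0.1
W_prof = rng.normal(size=(d_shared, 2)) * 0.1   # head 1: Basic vs. Advanced
W_style = rng.normal(size=(d_shared, 2)) * 0.1  # head 2: Formal vs. Literary

def dual_head_forward(x):
    h = relu(x @ W_shared)                 # shared feature representation
    return softmax(h @ W_prof), softmax(h @ W_style)

x = rng.normal(size=(4, d_in))             # batch of 4 feature vectors
p_prof, p_style = dual_head_forward(x)
print(p_prof.shape, p_style.shape)         # (4, 2) (4, 2)
```

Because both heads branch from the same hidden representation, a single backward pass would update the shared weights from both task losses — the mechanism by which such multi-task models let the proficiency and style objectives regularize one another.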

Item Type: Article
Article Type: Article
Uncontrolled Keywords: Multimodal learning; Language proficiency classification; Style classification; Deep learning; Tamil–Hindi dataset
Related URLs:
Subjects: Q Science > Q Science (General) > Q336 Artificial intelligence
Q Science > QA Mathematics > QA76 Computer software > QA76.9 Other topics > QA76.9.A43 Algorithms
Divisions: Schools and Research Institutes > School of Business, Computing and Social Sciences
Depositing User: Martin Wynn
Date Deposited: 06 May 2026 12:38
Last Modified: 06 May 2026 13:00
URI: https://eprints.glos.ac.uk/id/eprint/16235
