Towards an End-to-End Personal Fine-Tuning Framework for AI Value Alignment

Watson, Eleanor, Viana, Thiago ORCID: https://orcid.org/0000-0001-9380-4611, Zhang, Shujun ORCID: https://orcid.org/0000-0001-5699-2676, Sturgeon, Benjamin and Petersson, Lukas (2024) Towards an End-to-End Personal Fine-Tuning Framework for AI Value Alignment. Electronics, 13 (20). art 4044. doi:10.3390/electronics13204044

14476 Watson, El. et al. (2024) Towards an End-to-End Personal Fine-Tuning Framework for AI Value Alignment.pdf - Published Version
Available under License Creative Commons Attribution 4.0.

Official URL: http://dx.doi.org/10.3390/electronics13204044

Abstract

This study introduces a novel architecture for value, preference, and boundary alignment in large language models (LLMs) and generative AI systems, accompanied by an experimental implementation. It addresses the limitations in AI model trustworthiness stemming from insufficient comprehension of personal context, preferences, and cultural diversity, which can lead to biases and safety risks. Using an inductive, qualitative research approach, we propose a framework for personalizing AI models to improve model alignment through additional context and boundaries set by users. Our framework incorporates user-friendly tools for identification, annotation, and simulation across diverse contexts, utilizing prompt-driven semantic segmentation and automatic labeling. It aims to streamline scenario generation and personalization processes while providing accessible annotation tools. The study examines various components of this framework, including user interfaces, underlying tools, and system mechanics. We present a pilot study that demonstrates the framework’s ability to reduce the complexity of value elicitation and personalization in LLMs. Our experimental setup involves a prototype implementation of key framework modules, including a value elicitation interface and a fine-tuning mechanism for language models. The primary goal is to create a token-based system that allows users to easily impart their values and preferences to AI systems, enhancing model personalization and alignment. This research contributes to the democratization of AI model fine-tuning and dataset generation, advancing efforts in AI value alignment. By focusing on practical implementation and user interaction, our study bridges the gap between theoretical alignment approaches and real-world applications in AI systems.

Item Type:	Article
Article Type:	Article
Uncontrolled Keywords:	Machine learning; Annotation; Alignment; Framework; Foundation models
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software; Q Science > QA Mathematics > QA76 Computer software > QA76.9 Other topics > QA76.9.H85 Human-computer interaction
Divisions:	Schools and Research Institutes > School of Business, Computing and Social Sciences
Depositing User:	Kamila Niekoraniec
Date Deposited:	24 Oct 2024 10:57
Last Modified:	08 Aug 2025 10:00
URI:	https://eprints.glos.ac.uk/id/eprint/14476
