In conversation with Artificial Intelligence: aligning language models with human values

Kasirzadeh, Atoosa (2022) In conversation with Artificial Intelligence: aligning language models with human values. [Preprint]

Abstract

Large-scale language technologies are increasingly used in various forms of communication with humans across different contexts. One particular use case for these technologies is conversational agents, which output natural language text in response to prompts and queries. This mode of engagement raises a number of social and ethical questions. For example, what does it mean to align conversational agents with human norms or values? Which norms or values should they be aligned with? And how can this be accomplished? In this paper, we propose a number of steps that help answer these questions. We start by developing a philosophical analysis of the building blocks of linguistic communication between conversational agents and human interlocutors. We then use this analysis to identify and formulate ideal norms of conversation that can govern successful linguistic communication between humans and conversational agents. Furthermore, we explore how these norms can be used to align conversational agents with human values across a range of different discursive domains. We conclude by discussing the practical implications of our proposal for the design of conversational agents that are aligned with these norms and values.

Item Type:

Preprint

Creators:

Creators	Email	ORCID
Kasirzadeh, Atoosa	atoosa.kasirzadeh@mail.utoronto.ca

Keywords:

Philosophy of artificial intelligence; large language models; ethics of artificial intelligence; value alignment; AI ethics

Subjects:

Specific Sciences > Artificial Intelligence
General Issues > Ethical Issues
General Issues > Technology

Depositing User:

Dr. Atoosa Kasirzadeh

Date Deposited:

08 Dec 2022 15:31

Last Modified:

08 Dec 2022 15:31

Item ID:

21522

Date:

2022

URI:

https://philsci-archive.pitt.edu/id/eprint/21522
