The Incoherence of Reflexive AI Governance: An Architectural Theory Across the Threat Spectrum

Sawaf, Khaled (2026) The Incoherence of Reflexive AI Governance: An Architectural Theory Across the Threat Spectrum. [Preprint]

Text (Manuscript as submitted to AI and Ethics (Springer), May 2026.)
Paper5_arxiv_preview.pdf - Submitted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

Every major approach to AI governance currently in operation ties the governance function structurally to the entity it is supposed to govern. Looseness in how governance is defined has enabled this reflexive practice. This paper argues that reflexive AI governance does not hold across the full threat landscape that AI systems present. Those threats fall along a continuous spectrum, from contained harms to autonomous actions, and a governance framework must demonstrate structural adequacy at every position on that spectrum, independently and simultaneously. Current frameworks fail to meet this criterion. Each depends on the model provider's cooperation as the load-bearing element of its own governance, and as threats grow more severe, this dependence becomes less reliable. Structural separation of the judgment function from the governed system is a necessary condition for spectrum-wide adequacy. The paper proposes an architecture that meets this requirement without losing the access that serious oversight depends on: a judgment layer connected to the system at runtime but structurally outside it, with the system's behavior visible to outside authorities and the layer's judgment decisions open to inspection. Access and independence stop being alternatives.

Item Type:

Preprint

Creators:

Sawaf, Khaled (ORCID: 0009-0001-5326-6713)

Additional Information:

Submitted to AI and Ethics (Springer), AI Agents: Ethics, Safety, and Governance topical collection. Currently under peer review.

Keywords:

AI governance, AI agents, Reflexive governance, Structural separation, Threat spectrum, Design science, AI ethics, AI safety, AI regulation

Subjects:

Specific Sciences > Artificial Intelligence > AI and Ethics
Specific Sciences > Computer Science
Specific Sciences > Artificial Intelligence

Depositing User:

Khaled Sawaf

Date Deposited:

05 May 2026 13:42

Last Modified:

05 May 2026 13:42

Item ID:

29458

Date:

1 May 2026

URI:

https://philsci-archive.pitt.edu/id/eprint/29458
