Rabiza, Marcin (2024). A Mechanistic Explanatory Strategy for XAI. [Preprint]
Text: Rabiza - Mechanistic XAI (preprint).pdf (Accepted Version, 591kB)
Abstract
Despite significant advancements in XAI, scholars note a persistent lack of solid conceptual foundations and integration with broader scientific discourse on explanation. In response, emerging XAI research draws on explanatory strategies from various sciences and the philosophy of science literature to fill these gaps. This paper outlines a mechanistic strategy for explaining the functional organization of deep learning systems, situating recent advancements in AI explainability within a broader philosophical context. According to the mechanistic approach, the explanation of opaque AI systems involves identifying the mechanisms that drive decision-making. For deep neural networks, this means discerning functionally relevant components—such as neurons, layers, circuits, or activation patterns—and understanding their roles through decomposition, localization, and recomposition. Proof-of-principle case studies from image recognition and language modeling align these theoretical approaches with the latest research from AI labs such as OpenAI and Anthropic. This research suggests that a systematic approach to studying model organization can reveal elements that simpler (or "more modest") explainability techniques might miss, fostering more thoroughly explainable AI. The paper concludes with a discussion of the epistemic relevance of the mechanistic approach, positioned in the context of selected philosophical debates on XAI.
Item Type: Preprint
Creators: Rabiza, Marcin
Additional Information: Forthcoming in Müller, V. C., Dewey, A. R., Dung, L., & Löhr, G. (Eds.), Philosophy of Artificial Intelligence: The State of the Art, Synthese Library, Berlin: Springer Nature. Please cite the published version.
Keywords: black box problem, explainable artificial intelligence (XAI), explainability, interpretability, mechanisms, mechanistic explanation, mechanistic interpretability, new mechanism
Subjects: Specific Sciences > Artificial Intelligence > AI and Ethics; General Issues > Causation; Specific Sciences > Cognitive Science > Computation; Specific Sciences > Computer Science; Specific Sciences > Artificial Intelligence; General Issues > Explanation; Specific Sciences > Artificial Intelligence > Machine Learning; General Issues > Models and Idealization
Depositing User: Mr. Marcin Rabiza
Date Deposited: 27 Jan 2025 14:20
Last Modified: 27 Jan 2025 14:20
Item ID: 24621
Date: 2 November 2024
URI: https://philsci-archive.pitt.edu/id/eprint/24621