A Mechanistic Explanatory Strategy for XAI

Rabiza, Marcin (2026) A Mechanistic Explanatory Strategy for XAI. [Preprint]

This is the latest version of this item.

Text: Rabiza - Mechanistic XAI v4 (preprint).pdf - Submitted Version (557kB)
Text: 23-phai2023-RABIZA v4 preprint.pdf (618kB)

Official URL: https://link.springer.com/chapter/10.1007/978-3-03...

Abstract

Despite significant advancements in XAI, scholars note a persistent lack of solid conceptual foundations and integration with broader scientific discourse on explanation. In response, emerging research draws on explanatory strategies from various sciences and the philosophy of science literature to fill these gaps. This paper outlines a mechanistic strategy for explaining the functional organization of deep learning systems, situating recent developments in explainable AI within a broader philosophical context. According to the mechanistic approach, the explanation of opaque AI systems involves identifying mechanisms that drive decision making. For deep neural networks, this means discerning functionally relevant components, such as neurons, layers, circuits, or activation patterns, and understanding their roles through decomposition, localization, and recomposition. Proof-of-principle case studies from image recognition and language modeling align these theoretical approaches with mechanistic interpretability research from OpenAI and Anthropic. The findings suggest that pursuing mechanistic explanations can uncover elements that traditional explainability techniques may overlook, ultimately contributing to more thoroughly explainable AI.

Item Type:

Preprint

Creators:

Creators	Email	ORCID
Rabiza, Marcin	marcin.rabiza@gssr.edu.pl	0000-0001-6217-6149

Additional Information:

This is a preprint of the following chapter: Rabiza, M. (2026). A mechanistic explanatory strategy for XAI. In V. C. Müller, L. Dung, G. Löhr, & A. Rumana (Eds.), Philosophy of artificial intelligence: The state of the art (pp. 389–412, Synthese Library, Vol. 533). Springer. It is the version of the author’s manuscript prior to acceptance for publication and has not undergone review on behalf of the Publisher. The final authenticated version is available online at: https://doi.org/10.1007/978-3-032-10073-3_23.

Keywords:

black box problem, explainable artificial intelligence (XAI), explainability, interpretability, mechanisms, mechanistic explanation, mechanistic interpretability, new mechanism

Subjects:

Specific Sciences > Artificial Intelligence > AI and Ethics
Specific Sciences > Computer Science
Specific Sciences > Artificial Intelligence
General Issues > Explanation
Specific Sciences > Artificial Intelligence > Machine Learning

Depositing User:

Mr. Marcin Rabiza

Date Deposited:

26 May 2026 12:31

Last Modified:

26 May 2026 12:31

Item ID:

29725

DOI or Unique Handle:

https://doi.org/10.1007/978-3-032-10073-3_23

Date:

18 May 2026

URI:

https://philsci-archive.pitt.edu/id/eprint/29725

Available Versions of this Item

A Mechanistic Explanatory Strategy for XAI. (deposited 03 Nov 2024 12:48)
