Rabiza, Marcin (2024) A Mechanistic Explanatory Strategy for XAI. [Preprint]
This is the latest version of this item.
Abstract
Despite significant advancements in XAI, scholars continue to note a persistent lack of robust conceptual foundations and integration with broader discourse on scientific explanation. In response, emerging XAI research increasingly draws on explanatory strategies from various scientific disciplines and the philosophy of science to address these gaps. This paper outlines a mechanistic strategy for explaining the functional organization of deep learning systems, situating recent developments in AI explainability within a broader philosophical context. According to the mechanistic approach, explaining opaque AI systems involves identifying the mechanisms underlying decision-making processes. For deep neural networks, this means discerning functionally relevant components — such as neurons, layers, circuits, or activation patterns — and understanding their roles through decomposition, localization, and recomposition. Proof-of-principle case studies from image recognition and language modeling align this theoretical framework with recent research from OpenAI and Anthropic. The findings suggest that pursuing mechanistic explanations can uncover elements that traditional explainability techniques may overlook, ultimately contributing to more thoroughly explainable AI.