Kästner, Lena and Crook, Barnaby (2023) Explaining AI Through Mechanistic Interpretability. [Preprint]
Abstract
Recent work in explainable artificial intelligence (XAI) attempts to render opaque AI systems understandable through a divide-and-conquer strategy. However, this strategy fails to illuminate how trained AI systems work as a whole. Precisely this kind of functional understanding is needed, though, to satisfy important societal desiderata such as safety. To remedy this situation, we argue, AI researchers should seek mechanistic interpretability, viz. apply coordinated discovery strategies familiar from the life sciences to uncover the functional organisation of complex AI systems. Additionally, theorists should account for the unique costs and benefits of such strategies in their portrayals of XAI research.
Item Type: | Preprint |
---|---|
Creators: | Kästner, Lena; Crook, Barnaby |
Keywords: | AI, ANN, deep learning, discovery, explanation, mechanistic interpretability, XAI |
Subjects: | General Issues > Data; Specific Sciences > Cognitive Science > Computation; Specific Sciences > Computer Science; General Issues > Explanation; General Issues > Models and Idealization; Specific Sciences > Neuroscience; General Issues > Philosophers of Science |
Depositing User: | Dr. Lena Kästner |
Date Deposited: | 08 Nov 2023 18:37 |
Last Modified: | 08 Nov 2023 18:37 |
Item ID: | 22747 |
Date: | 3 November 2023 |
URI: | https://philsci-archive.pitt.edu/id/eprint/22747 |