PhilSci Archive

DNNs, Dataset Statistics, and Correlation Functions

Batterman, Robert and Woodward, James (2026) DNNs, Dataset Statistics, and Correlation Functions. [Preprint]

Text: DNN_DATA_CF_ARC_version for archive.pdf (7MB)

Abstract

This paper argues that dataset structure is important in image recognition tasks (among other tasks). Specifically, we focus on the nature and genesis of correlational structure in the actual datasets on which DNNs are trained. We argue that DNNs implement a methodology widespread in condensed matter physics and materials science, one that focuses on mesoscale correlation structures lying between fundamental atomic/molecular scales and continuum scales. In particular, we argue that DNNs that succeed at image classification must be discovering higher-order correlation functions. It is well known that DNNs generalize successfully in apparent contravention of standard statistical learning theory. We consider the implications of our discussion for this puzzle.



Item Type: Preprint
Creators:
Batterman, Robert (rbatterm@pitt.edu)
Woodward, James (jfw@pitt.edu)
Additional Information: This is the latest, revised version of a paper with the same title that was previously posted on the archive.
Keywords: DNNs, statistical learning theory, overfitting, scale invariance, random matrix theory, higher order correlation functions
Subjects: Specific Sciences > Artificial Intelligence > Classical AI
General Issues > Confirmation/Induction
General Issues > Evidence
General Issues > Formal Learning Theory
Specific Sciences > Artificial Intelligence > Machine Learning
Depositing User: Jim Woodward
Date Deposited: 28 Apr 2026 20:54
Last Modified: 28 Apr 2026 20:54
Item ID: 29386
Date: 28 April 2026
URI: https://philsci-archive.pitt.edu/id/eprint/29386
