PhilSci Archive

DNNs, Dataset Statistics, and Correlation Functions

Batterman, Robert and Woodward, James (2026) DNNs, Dataset Statistics, and Correlation Functions. [Preprint]


Abstract

This paper argues that dataset structure matters in image recognition (among other tasks). We focus on the nature and genesis of correlational structure in the actual datasets on which DNNs are trained. We argue that DNNs implement a methodology widespread in condensed matter physics and materials science, one that focuses on mesoscale correlation structures lying between fundamental atomic/molecular scales and continuum scales. In particular, we argue that DNNs that succeed at image classification must be discovering higher-order correlation functions. It is well known that DNNs generalize successfully in apparent contravention of standard statistical learning theory. We consider the implications of our discussion for this puzzle.



Item Type: Preprint
Creators:
Batterman, Robert (Email: rbatterm@pitt.edu)
Woodward, James
Keywords: DNNs, statistical learning theory, overfitting, scale invariance, random matrix theory, higher order correlation functions
Subjects: Specific Sciences > Complex Systems
Specific Sciences > Computer Science
Specific Sciences > Physics > Condensed Matter
General Issues > Explanation
Depositing User: Jim Woodward
Date Deposited: 12 Feb 2026 12:12
Last Modified: 12 Feb 2026 12:12
Item ID: 28222
Date: 11 February 2026
URI: https://philsci-archive.pitt.edu/id/eprint/28222
