Napoletani, Domenico and Panza, Marco and Struppa, Daniele (2018) The Agnostic Structure of Data Science Methods. In: UNSPECIFIED.
This is the latest version of this item.
Text: TheAgnosticStructure.pdf (319kB) - Updated Version. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
In this paper we argue that data science is a coherent approach to empirical problems that, in its most general form, does not build understanding about phenomena. We start by exploring the broad structure of mathematization methods in data science, organized around the belief that if enough and sufficiently diverse data are collected regarding a certain phenomenon, it is possible to answer all relevant questions about it. We call this belief 'the microarray paradigm' and the approach to empirical phenomena based on it 'agnostic science'. Not all computational methods dealing with large data sets are properly within the domain of agnostic science, and we give an example of an algorithm, PageRank, that relies on large data processing, but such that the significance of its output is readily intelligible. Within the new type of mathematization at work in agnostic science, mathematical methods are not selected because of any particular relevance for a problem at hand. Rather, mathematical methods are applied to a specific problem only on the basis of their ability to reorganize the data for further analysis and the intrinsic richness of their mathematical structure. We refer to this type of mathematization as 'forcing'. We then show that optimization methods are used in data science by forcing them on problems. This is particularly significant since virtually all methods of data science can be reinterpreted as types of optimization methods. In particular, we argue that deep learning neural networks are best understood within the context of forcing optimality. We finally explore the broader question of the appropriateness of data science methods in solving problems. We argue that this question should not be interpreted as a search for a correspondence between phenomena and specific solutions found by data science methods. Rather, it is the internal structure of data science methods that is open to forms of understanding.
As an example, we offer an analysis of ensemble methods, where distinct data science methods are combined in the search for the solution of a problem, and we speculate on the general structure of the data sets that are most appropriate for such methods.
Item Type: | Conference or Workshop Item (UNSPECIFIED)
---|---
Additional Information: | To appear in Lato Sensu, revue de la Société de philosophie des sciences, Société de philosophie des sciences.
Keywords: | Data Analysis, Agnostic Sciences, Machine Learning
Subjects: | General Issues > Data; Specific Sciences > Mathematics > Applicability
Depositing User: | Marco Panza
Date Deposited: | 11 Feb 2021 15:34
Last Modified: | 11 Feb 2021 15:34
Item ID: | 18707
Date: | November 2018
URI: | https://philsci-archive.pitt.edu/id/eprint/18707
Available Versions of this Item
- The Agnostic Structure of Data Science Methods. (deposited 25 Nov 2018 17:29)
- The Agnostic Structure of Data Science Methods. (deposited 15 Jan 2020 05:33)
- The Agnostic Structure of Data Science Methods. (deposited 11 Feb 2021 15:34) [Currently Displayed]