The Agnostic Structure of Data Science Methods

Napoletani, Domenico and Panza, Marco and Struppa, Daniele (2018) The Agnostic Structure of Data Science Methods. In: UNSPECIFIED.

This is the latest version of this item.

TheAgnosticStructure.pdf - Updated Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Abstract

In this paper we argue that data science is a coherent approach to empirical problems that, in its most general form, does not build understanding about phenomena. We start by exploring the broad structure of mathematization methods in data science, organized around the belief that if enough and sufficiently diverse data are collected regarding a certain phenomenon, it is possible to answer all relevant questions about it. We call this belief 'the microarray paradigm' and the approach to empirical phenomena based on it 'agnostic science'. Not all computational methods dealing with large data sets are properly within the domain of agnostic science, and we give an example of an algorithm, PageRank, that relies on large data processing, but such that the significance of its output is readily intelligible. Within the new type of mathematization at work in agnostic science, mathematical methods are not selected because of any particular relevance for a problem at hand. Rather, mathematical methods are applied to a specific problem only on the basis of their ability to reorganize the data for further analysis and the intrinsic richness of their mathematical structure. We refer to this type of mathematization as 'forcing'. We then show that optimization methods are used in data science by forcing them on problems. This is particularly significant since virtually all methods of data science can be reinterpreted as types of optimization methods. In particular, we argue that deep learning neural networks are best understood within the context of forcing optimality. We finally explore the broader question of the appropriateness of data science methods in solving problems. We argue that this question should not be interpreted as a search for a correspondence between phenomena and specific solutions found by data science methods. Rather, it is the internal structure of data science methods that is open to forms of understanding. As an example, we offer an analysis of ensemble methods, where distinct data science methods are combined in the search for the solution of a problem, and we speculate on the general structure of the data sets that are most appropriate for such methods.

Item Type:

Conference or Workshop Item (UNSPECIFIED)

Creators:

Creators	Email	ORCID
Napoletani, Domenico
Panza, Marco	panzam10@gmail.com	0000-0003-4131-7103
Struppa, Daniele

Additional Information:

To appear in Lato Sensu, revue de la Société de philosophie des sciences, Société de philosophie des sciences.

Keywords:

Data Analysis, Agnostic Sciences, Machine Learning

Subjects:

General Issues > Data
Specific Sciences > Mathematics > Applicability

Depositing User:

Marco Panza

Date Deposited:

11 Feb 2021 15:34

Last Modified:

11 Feb 2021 15:34

Item ID:

18707

Date:

November 2018

URI:

https://philsci-archive.pitt.edu/id/eprint/18707

Available Versions of this Item

The Agnostic Structure of Data Science Methods. (deposited 25 Nov 2018 17:29)
- The Agnostic Structure of Data Science Methods. (deposited 15 Jan 2020 05:33)
  - The Agnostic Structure of Data Science Methods. (deposited 11 Feb 2021 15:34) [Currently Displayed]
