PhilSci Archive

The Big Data razor

López-Rubio, Ezequiel (2020) The Big Data razor. [Preprint]

MLSimplicity_EJPS_Preprint.pdf - Accepted Version


Abstract

Classic conceptions of model simplicity for machine learning are mainly based on the analysis of the structure of the model. The best known of them are the Bayesian, frequentist, information-theoretic and expressive power concepts, which are reviewed in this work along with their underlying assumptions and weaknesses. These approaches were developed before the advent of the Big Data deluge, which has overturned the importance of structural simplicity. The computational simplicity concept is presented, and it is argued that it is more encompassing and closer to actual machine learning practices than the classic ones. In order to process the huge datasets which are commonplace nowadays, the computational complexity of the learning algorithm is the decisive factor to assess the viability of a machine learning strategy, while the classic accounts of simplicity play a surrogate role. Some of the desirable features of computational simplicity derive from its reliance on the learning system concept, which integrates key aspects of machine learning that are ignored by the classic concepts. Moreover, computational simplicity is directly associated with energy efficiency. In particular, the question of whether the maximum possibly achievable predictive accuracy should be attained, no matter the economic cost of the associated energy consumption pattern, is considered.



Item Type: Preprint
Creators: López-Rubio, Ezequiel (Email: ezeqlr@lcc.uma.es; ORCID: 0000-0001-8231-5687)
Keywords: model simplicity, machine learning, Bayesianism, information theory, energy efficiency
Subjects: General Issues > Data
Specific Sciences > Computer Science
Specific Sciences > Artificial Intelligence
Specific Sciences > Artificial Intelligence > Machine Learning
Specific Sciences > Probability/Statistics
General Issues > Technology
Depositing User: Prof. Ezequiel López-Rubio
Date Deposited: 27 Mar 2020 01:33
Last Modified: 27 Mar 2020 01:33
Item ID: 17027
Official URL: https://dx.doi.org/10.1007/s13194-020-00288-8
DOI or Unique Handle: 10.1007/s13194-020-00288-8
Date: March 2020
URI: https://philsci-archive.pitt.edu/id/eprint/17027
