QUICK REVIEW

[Paper Review] The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning

Micah Goldblum, Marc Finzi | arXiv (Cornell University) | Apr 11, 2023

Computability, Logic, AI Algorithms | 14 citations

TL;DR

The paper derives a no free lunch theorem based on Kolmogorov complexity, shows that real-world data is disproportionately low-complexity and that neural networks share a preference for low-complexity solutions, and argues that a small set of models with suitable inductive biases, supported by PAC-Bayes bounds, can learn across domains.

ABSTRACT

No free lunch theorems for supervised learning state that no learner can solve all problems or that all learners achieve exactly the same accuracy on average over a uniform distribution on learning problems. Accordingly, these theorems are often referenced in support of the notion that individual problems require specially tailored inductive biases. While virtually all uniformly sampled datasets have high complexity, real-world problems disproportionately generate low-complexity data, and we argue that neural network models share this same preference, formalized using Kolmogorov complexity. Notably, we show that architectures designed for a particular domain, such as computer vision, can compress datasets on a variety of seemingly unrelated domains. Our experiments show that pre-trained and even randomly initialized language models prefer to generate low-complexity sequences. Whereas no free lunch theorems seemingly indicate that individual problems require specialized learners, we explain how tasks that often require human intervention such as picking an appropriately sized model when labeled data is scarce or plentiful can be automated into a single learning algorithm. These observations justify the trend in deep learning of unifying seemingly disparate problems with an increasingly small set of machine learning models.

Motivation & Objective

Motivate induction in ML by contrasting the structure of real-world data with the uniform-distribution assumption behind no free lunch (NFL) theorems.
Derive a Kolmogorov-complexity-based NFL theorem to explain why learning is feasible in practice.
Demonstrate that real datasets and neural networks exhibit a low-complexity bias across domains.
Show how cross-domain PAC-Bayes bounds can explain generalization and support a unified learning approach.
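
The incompressibility idea behind the second objective can be illustrated with the standard counting argument below. This is a textbook sketch, not the paper's exact theorem statement; the symbols n (string length in bits) and k (number of bits saved) are introduced here for illustration only.

```latex
% Standard counting argument (an illustration, not the paper's theorem).
% There are fewer than 2^{n-k} binary programs shorter than n-k bits,
% and each program outputs at most one string.
\[
\bigl|\{\, x \in \{0,1\}^n : K(x) < n - k \,\}\bigr|
\;\le\; \sum_{j=0}^{n-k-1} 2^{j} \;=\; 2^{\,n-k} - 1 ,
\]
% Dividing by the 2^n strings of length n gives the probability under
% a uniform distribution.
\[
\Pr_{x \sim \mathrm{Unif}(\{0,1\}^n)}\bigl[\, K(x) < n - k \,\bigr] \;<\; 2^{-k}.
\]
```

For example, with k = 20, fewer than one in a million uniformly sampled strings can be compressed by 20 bits or more, which is why structure in real data, rather than uniform noise, is what makes learning feasible.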

Proposed method

Derive a new NFL theorem using incompressibility via Kolmogorov complexity.
Use compression (e.g., bzip2) to upper-bound the Kolmogorov complexities K(X) and K(Y|X) of datasets.
Express K(Y|X) in terms of negative log-likelihood and model size to show compression implies learnability.
Demonstrate a simplicity bias in neural networks by compressing labels across tabular and image domains.
Define a simple formal language for generating sequences and use it to measure the complexity of sequences generated by language models (e.g., GPT-3).
Reshape tabular data into images to test cross-domain generalization bounds with CNNs.
Present PAC-Bayes-style generalization bounds tied to dataset compressibility and marginal likelihood.
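
The compression step in the list above can be sketched with Python's standard `bz2` module. This is a minimal illustration under stated assumptions, not the paper's experimental code: the toy "structured" byte string and the helper name `compressed_bits` are hypothetical, and the compressed length only upper-bounds K(x) up to an additive constant (the size of the decompressor).

```python
import bz2
import os


def compressed_bits(data: bytes) -> int:
    """Length in bits of the bzip2 encoding of `data`.

    Up to an additive constant for the decompressor, this is an
    upper bound on the Kolmogorov complexity K(data).
    """
    return 8 * len(bz2.compress(data, compresslevel=9))


n = 100_000  # number of bytes in each toy "dataset" (hypothetical size)

# A highly regular byte string, standing in for structured real-world data.
structured = bytes(i % 7 for i in range(n))

# Uniformly random bytes, standing in for a uniformly sampled dataset.
random_bytes = os.urandom(n)

raw_bits = 8 * n
structured_bits = compressed_bits(structured)
random_bits = compressed_bits(random_bytes)

print(f"raw size:          {raw_bits} bits")
print(f"structured bound:  {structured_bits} bits")
print(f"random bound:      {random_bits} bits")
```

The regular string shrinks to a small fraction of its raw size, while the random bytes do not shrink at all, mirroring the contrast the review draws between real and uniformly sampled data.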

Experimental results

Research questions

RQ1: Do real-world datasets exhibit compressibility that explains successful ML generalization despite NFL theorems?
RQ2: Do neural networks and large language models prefer low-Kolmogorov-complexity solutions across domains?
RQ3: Can cross-domain PAC-Bayes bounds account for generalization when models are used beyond their native domain (e.g., CNNs on tabular data)?

Key findings

Real datasets are highly compressible, in contrast to uniformly random data which is incompressible.
Neural networks compress the labeling function, implying a non-trivial K(Y|X) bound linked to model likelihood.
There exists a Kolmogorov-style NFL theorem showing learning is possible on compressible data and impossible on incompressible data.
GPT-3 models, including the larger ones, assign exponentially higher probability to simpler sequences (those with low Kolmogorov complexity).
CNNs trained on artificially encoded tabular data generalize well due to a strong simplicity bias, as shown by PAC-Bayes compression bounds.
A single model family can perform well across diverse problems, aligning with a low-complexity inductive bias and reducing need for domain-specific models.
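
The link between model likelihood and a K(Y|X) bound mentioned in the findings can be sketched as follows. The idea is that labels can be encoded with about -log2 p(y|x) bits each using the model's predictions (arithmetic coding adds at most 2 bits overall), plus the bits needed to describe the model itself; machine-dependent additive constants are ignored. The function name, the probabilities, and the 500-bit model size are hypothetical values chosen for illustration, not results from the paper.

```python
import math

# Standard overhead of arithmetic coding on top of the ideal code length.
ARITHMETIC_CODING_OVERHEAD_BITS = 2


def label_code_length_bits(probs_of_true_label, model_bits):
    """Bits needed to transmit the labels Y given the inputs X.

    probs_of_true_label: the probability the model assigns to the
        correct label of each example (hypothetical values here).
    model_bits: length of a description of the model in bits.
    """
    nll_bits = -sum(math.log2(p) for p in probs_of_true_label)
    return nll_bits + model_bits + ARITHMETIC_CODING_OVERHEAD_BITS


n, num_classes = 1000, 10

# Baseline: send every label directly, with no model at all.
naive_bits = n * math.log2(num_classes)

# Hypothetical compressible model: 500-bit description, probability 0.9
# on the true label of every example.
confident_bits = label_code_length_bits([0.9] * n, model_bits=500)

# A model that guesses uniformly saves nothing over the baseline.
chance_bits = label_code_length_bits([1 / num_classes] * n, model_bits=0)

print(f"naive encoding:   {naive_bits:.1f} bits")
print(f"confident model:  {confident_bits:.1f} bits")
print(f"chance model:     {chance_bits:.1f} bits")
```

A model only yields a non-trivial bound when its own description is short relative to the bits it saves on the labels, which is the sense in which compression and generalization are tied together in the findings above.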

This review was created by AI and reviewed by human editors.