[Paper Review] A Precise Performance Analysis of Learning with Random Features
This paper presents a precise asymptotic analysis of learning with random features under Gaussian data, establishing exact characterizations of training and generalization errors across both under- and over-parameterized regimes. Using the uniform Gaussian equivalence conjecture, it derives closed-form expressions for error performance that hold for general feature matrices, activation functions, and convex loss functions, revealing the critical roles of regularization, loss, and activation in mitigating the double descent phenomenon.
We study the problem of learning an unknown function using random feature models. Our main contribution is an exact asymptotic analysis of such learning problems with Gaussian data. Under mild regularity conditions for the feature matrix, we provide an exact characterization of the asymptotic training and generalization errors, valid in both the under-parameterized and over-parameterized regimes. The analysis presented in this paper holds for general families of feature matrices, activation functions, and convex loss functions. Numerical results validate our theoretical predictions, showing that our asymptotic findings are in excellent agreement with the actual performance of the considered learning problem, even in moderate dimensions. Moreover, they reveal an important role played by the regularization, the loss function and the activation function in the mitigation of the "double descent phenomenon" in learning.
Motivation & Objective
- To provide an exact asymptotic characterization of training and generalization errors in random feature models under Gaussian data.
- To extend performance analysis beyond the over-parameterized regime to include under-parameterized settings.
- To investigate the interplay between regularization, loss function, and activation function in shaping generalization performance.
- To validate theoretical predictions through numerical experiments in moderate-dimensional settings.
- To use the uniform Gaussian equivalence conjecture as the working assumption underlying the asymptotic analysis of random feature learning.
Proposed method
- Derives an asymptotically equivalent Gaussian formulation of the original random feature optimization problem, in which the nonlinear features are replaced by a Gaussian proxy.
- Applies the uniform Gaussian equivalence conjecture (uGEC) to replace the nonlinear feature vector with a linear-plus-noise Gaussian combination governed by the constants μ₀, μ₁, and μ⋆.
- Uses tools from high-dimensional probability and asymptotic analysis to characterize the limiting behavior of the optimization problem as dimension n → ∞.
- Establishes strict convexity of the asymptotic cost function in the variables q and β, ensuring uniqueness and stability of the solution.
- Employs convergence results from stochastic optimization and set deviation theory to prove almost-sure convergence of optimal solutions and costs.
- Validates theoretical predictions via numerical simulations, showing strong agreement even in moderate dimensions.
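The Gaussian replacement step in the list above can be illustrated with a minimal sketch. It assumes the definitions commonly used in the Gaussian-equivalence literature, namely μ₀ = E[σ(z)], μ₁ = E[zσ(z)], and μ⋆² = E[σ(z)²] − μ₀² − μ₁² for z ~ N(0, 1); the paper's exact normalization may differ. The function names, the ReLU activation, the unit-norm feature rows, and the problem sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def gaussian_moments(sigma, half_width=10.0, n_grid=200_001):
    """Approximate mu0, mu1, mu_star for z ~ N(0, 1) with a Riemann sum on a fine grid."""
    z = np.linspace(-half_width, half_width, n_grid)
    dz = z[1] - z[0]
    density = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    s = sigma(z)
    mu0 = np.sum(s * density) * dz            # E[sigma(z)]
    mu1 = np.sum(z * s * density) * dz        # E[z * sigma(z)]
    second = np.sum(s**2 * density) * dz      # E[sigma(z)^2]
    mu_star = np.sqrt(max(second - mu0**2 - mu1**2, 0.0))
    return mu0, mu1, mu_star

def gaussian_equivalent_features(F, X, mu0, mu1, mu_star, rng):
    """Assumed surrogate: mu0 + mu1 * (F x) + mu_star * (independent Gaussian noise)."""
    Z = rng.standard_normal((X.shape[0], F.shape[0]))
    return mu0 + mu1 * (X @ F.T) + mu_star * Z

rng = np.random.default_rng(0)
n, k, m = 50, 20, 20_000                      # input dim, features, samples (illustrative)
F = rng.standard_normal((k, n))
F /= np.linalg.norm(F, axis=1, keepdims=True)  # unit-norm rows so each f_i^T x ~ N(0, 1)
X = rng.standard_normal((m, n))

mu0, mu1, mu_star = gaussian_moments(relu)
nonlinear = relu(X @ F.T)
surrogate = gaussian_equivalent_features(F, X, mu0, mu1, mu_star, rng)

print("mu0, mu1, mu_star:", mu0, mu1, mu_star)
print("entry mean (nonlinear, surrogate):", nonlinear.mean(), surrogate.mean())
print("entry variance (nonlinear, surrogate):", nonlinear.var(), surrogate.var())
```

The sketch only checks that the two feature maps agree in their first and second entrywise moments. The conjecture itself is a stronger statement about the training and generalization errors of the resulting optimization problems.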
Experimental results
Research questions
- RQ1: How do training and generalization errors behave asymptotically in random feature models with Gaussian data?
- RQ2: To what extent does the uniform Gaussian equivalence conjecture accurately represent the performance of random feature learning?
- RQ3: How do regularization, activation function, and loss function jointly influence the double descent phenomenon in generalization error?
- RQ4: Can exact asymptotic expressions for generalization and training errors be derived for both under- and over-parameterized regimes?
- RQ5: What is the role of the feature matrix structure in determining the asymptotic performance of random feature models?
Key findings
- The asymptotic training and generalization errors are exactly characterized in closed form for general feature matrices, activation functions, and convex loss functions.
- The proposed Gaussian formulation accurately predicts performance even in moderate dimensions, with numerical results showing excellent agreement with theory.
- Regularization is shown to play a key role in mitigating the double descent phenomenon, particularly in the over-parameterized regime.
- The choice of activation function significantly affects the generalization error, with certain nonlinearities leading to better performance.
- The loss function influences the optimization landscape and contributes to the mitigation of double descent, with convex losses enabling stable convergence.
- The asymptotic analysis confirms that the optimal solution converges in probability to the true minimizer, with convergence rates supported by theoretical guarantees.
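The effect of regularization on double descent noted in the findings above can be reproduced qualitatively with a small simulation. This is a sketch under stated assumptions, not the paper's experimental setup: it uses a squared loss with a ridge penalty, ReLU features, a noisy linear teacher, and hypothetical sizes (n = 30 inputs, m = 100 training samples). All function and variable names are illustrative.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def random_feature_test_error(k, lam, n=30, m=100, m_test=2000, noise=0.5, seed=0):
    """Fit ridge regression on k random ReLU features and return the test MSE."""
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(n)
    beta /= np.linalg.norm(beta)                   # hypothetical linear teacher
    F = rng.standard_normal((k, n))
    F /= np.linalg.norm(F, axis=1, keepdims=True)  # unit-norm feature vectors

    def sample(num):
        X = rng.standard_normal((num, n))          # Gaussian data
        y = X @ beta + noise * rng.standard_normal(num)
        return relu(X @ F.T), y

    Phi, y = sample(m)
    Phi_test, y_test = sample(m_test)
    # Ridge solution: minimize (1/m) * ||Phi w - y||^2 + lam * ||w||^2
    A = Phi.T @ Phi / m + lam * np.eye(k)
    w = np.linalg.solve(A, Phi.T @ y / m)
    return float(np.mean((Phi_test @ w - y_test) ** 2))

feature_counts = [25, 50, 100, 200, 400]           # k = 100 matches m (interpolation threshold)
weak_lam, strong_lam = 1e-6, 1e-1

def sweep(lam, n_trials=5):
    """Average the test error over a few seeds for each number of features."""
    return {
        k: float(np.mean([random_feature_test_error(k, lam, seed=s) for s in range(n_trials)]))
        for k in feature_counts
    }

weak = sweep(weak_lam)
strong = sweep(strong_lam)
for k in feature_counts:
    print(f"k={k:4d}  test MSE (lam={weak_lam:g}): {weak[k]:9.3f}  (lam={strong_lam:g}): {strong[k]:9.3f}")
```

With the nearly unregularized fit, the test error is expected to peak when the number of features is close to the number of training samples and to fall again as the model becomes over-parameterized. The larger penalty is expected to flatten that peak.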
This review was created by AI and reviewed by human editors.