QUICK REVIEW

[Paper Review] Safe Screening With Variational Inequalities and Its Application to LASSO

Jun Liu, Zheng Zhao|arXiv (Cornell University)|Jul 29, 2013

Statistical Methods and Inference|16 references|66 citations

TL;DR

This paper proposes Sasvi, a novel safe screening method for Lasso that leverages variational inequalities to derive a stronger, optimality-based screening rule. By using the exact dual problem's variational inequality condition, Sasvi identifies inactive features more effectively than prior safe screening methods, achieving significant computational speedups while guaranteeing no active features are mistakenly removed.

ABSTRACT

Sparse learning techniques have been routinely used for feature selection as the resulting model usually has a small number of non-zero entries. Safe screening, which eliminates the features that are guaranteed to have zero coefficients for a certain value of the regularization parameter, is a technique for improving the computational efficiency. Safe screening is gaining increasing attention since 1) solving sparse learning formulations usually has a high computational cost especially when the number of features is large and 2) one needs to try several regularization parameters to select a suitable model. In this paper, we propose an approach called "Sasvi" (Safe screening with variational inequalities). Sasvi makes use of the variational inequality that provides the sufficient and necessary optimality condition for the dual problem. Several existing approaches for Lasso screening can be cast as relaxed versions of the proposed Sasvi, thus Sasvi provides a stronger safe screening rule. We further study the monotone properties of Sasvi for Lasso, based on which a sure removal regularization parameter can be identified for each feature. Experimental results on both synthetic and real data sets are reported to demonstrate the effectiveness of the proposed Sasvi for Lasso screening.

Motivation & Objective

To develop a more effective safe screening rule for Lasso that guarantees no active features are incorrectly removed.
To leverage the variational inequality condition as the sufficient and necessary optimality criterion for the dual problem to strengthen screening rules.
To identify a 'sure removal' regularization parameter for each feature using monotonicity properties of the screening bound.
To demonstrate the superiority of the proposed method over existing safe screening rules (e.g., SAFE, DPP) and heuristic rules (e.g., strong rules) in computational efficiency and accuracy.
To extend the framework to generalized sparse linear models such as sparse logistic regression using similar variational inequality principles.
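
For reference, the optimality condition named in these objectives can be written out as follows. This is a sketch under the common Lasso scaling, which is an assumption here because the review does not state the formulation; the symbols X, y, β, θ, λ and Ω are introduced for illustration.

```latex
% Lasso primal (assumed scaling), with columns x_j of X:
\min_{\beta}\ \tfrac{1}{2}\,\lVert X\beta - y\rVert_2^2 + \lambda\,\lVert\beta\rVert_1

% Dual problem: project y/\lambda onto the polytope \Omega
\min_{\theta \in \Omega}\ \tfrac{1}{2}\,\Bigl\lVert \theta - \tfrac{y}{\lambda}\Bigr\rVert_2^2,
\qquad
\Omega = \{\theta : \lvert\langle x_j, \theta\rangle\rvert \le 1 \ \ \forall j\}

% Variational inequality: \theta^* is dual-optimal if and only if
\Bigl\langle \theta^* - \tfrac{y}{\lambda},\ \theta - \theta^* \Bigr\rangle \ \ge\ 0
\qquad \forall\, \theta \in \Omega

% Primal-dual link and the screening consequence:
y - X\beta^* = \lambda\,\theta^*,
\qquad
\lvert\langle x_j, \theta^*\rangle\rvert < 1 \ \Longrightarrow\ \beta_j^* = 0
```

The last implication is what makes screening "safe": any certified upper bound on |⟨x_j, θ*⟩| that falls strictly below 1 proves that feature j is inactive.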

Proposed method

Derive the dual problem of the Lasso optimization and use the variational inequality condition as the exact optimality criterion for the dual variable.
Construct a feasible set for the dual variable at a smaller regularization parameter using the optimality condition from the previous solution.
Estimate an upper bound on the inner product |⟨x_j, θ*⟩| for each feature j over the feasible set; if the bound is strictly less than 1, feature j is guaranteed to have a zero coefficient and can be safely discarded.
Establish monotonicity properties of the upper-bound function to identify a 'sure removal' regularization parameter for each feature, ensuring it remains inactive beyond that point.
Apply the screening rule iteratively along the Lasso path, discarding features that are provably inactive at each step.
Extend the method to sparse logistic regression by deriving its dual problem and applying the same variational inequality-based screening framework with approximations for tractability.
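
The screening steps above can be sketched in a few lines of Python. This is not the Sasvi bound itself: the sketch uses the simpler sphere-based relaxation (the DPP-style rule, which the abstract describes as a relaxed version of Sasvi), so it discards fewer features than Sasvi would. The objective scaling 0.5*||y - Xb||^2 + lam*||b||_1 and the function names `lambda_max`, `sphere_screen`, and `lasso_cd` are assumptions made for illustration, not names from the paper.

```python
import numpy as np


def lambda_max(X, y):
    """Smallest lam for which the Lasso solution is all zeros
    (objective assumed: 0.5*||y - X b||^2 + lam*||b||_1)."""
    return np.max(np.abs(X.T @ y))


def sphere_screen(X, y, theta_prev, lam_prev, lam):
    """Boolean mask of features that are provably inactive at lam.

    theta_prev is the exact dual optimum at lam_prev. The dual optimum
    is the projection of y/lam onto a convex set, and projections are
    nonexpansive, so theta*(lam) lies in a ball around theta_prev with
    the radius computed below. This is a DPP-style relaxation, not Sasvi.
    """
    radius = np.linalg.norm(y) * abs(1.0 / lam - 1.0 / lam_prev)
    # Upper bound on |<x_j, theta*(lam)>| over that ball.
    bound = np.abs(X.T @ theta_prev) + np.linalg.norm(X, axis=0) * radius
    return bound < 1.0


def lasso_cd(X, y, lam, n_sweeps=200):
    """Plain cyclic coordinate descent, used only to check the mask."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = y.astype(float).copy()  # resid = y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 200))
    y = X[:, :5].sum(axis=1) + 0.1 * rng.standard_normal(50)

    lam0 = lambda_max(X, y)
    theta0 = y / lam0          # exact dual optimum at lam0, where beta = 0
    lam = 0.95 * lam0

    discard = sphere_screen(X, y, theta0, lam0, lam)
    beta = lasso_cd(X, y, lam)
    print("discarded:", int(discard.sum()), "of", X.shape[1])
    print("max |beta| among discarded:", float(np.abs(beta[discard]).max(initial=0.0)))
```

Starting from lam0 = lambda_max is convenient because the dual optimum there is known in closed form (y / lam0). Along a path, the same call would reuse the dual optimum from the previously solved lam, which is the role of the "previous solution" in the steps above.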

Experimental results

Research questions

RQ1: Can variational inequalities be used to derive a stronger, provably correct screening rule for Lasso compared to existing safe screening methods?
RQ2: How does the proposed Sasvi method improve upon relaxed screening rules like SAFE and DPP in terms of feature elimination capability?
RQ3: What is the role of monotonicity in the upper-bound estimation, and how can it be exploited to identify a 'sure removal' regularization parameter for each feature?
RQ4: How does Sasvi compare in computational efficiency and accuracy to heuristic screening rules such as the strong rule, especially in terms of avoiding correction steps?
RQ5: Can the variational inequality-based screening framework be generalized to other sparse linear models like logistic regression?

Key findings

Sasvi is reported to be faster than SAFE and DPP. On synthetic data with 5,000 features, it reduces the average running time from 101.55 seconds (solver without screening) to 2.76 seconds, roughly a 37x speedup.
On the MNIST dataset, Sasvi reduces running time from 2,683.57 seconds to 5.02 seconds, demonstrating a speedup of over 500x compared to the baseline solver.
Sasvi achieves rejection ratios comparable to the strong rule (a heuristic method), but with the advantage of being provably correct and requiring no KKT checks or corrections.
The proposed method identifies a 'sure removal' regularization parameter for each feature, enabling early and guaranteed elimination of inactive features along the solution path.
Empirical results show that Sasvi maintains high screening accuracy while providing stronger theoretical guarantees than existing safe screening rules.
The extension to sparse logistic regression is feasible in principle, though the upper-bound estimation is more complex and may require quadratic approximation for tractability.

This review was created by AI and reviewed by human editors.