An Efficient Algorithm For Weak Hierarchical Lasso

Authors: Liu Yashu*; Wang Jie*; Ye Jieping*
Source: ACM Transactions on Knowledge Discovery from Data, 2016, 10(3): 32.
DOI: 10.1145/2791295

Abstract

Linear regression is a widely used tool in data mining and machine learning. In many applications, fitting a regression model with only linear effects may not be sufficient for predictive or explanatory purposes. One strategy that has recently received increasing attention in statistics is to include feature interactions to capture the nonlinearity in the regression model. Such models have been applied successfully in many biomedical applications. One major challenge in using such models is that the dimensionality of the data with interactions is significantly higher than that of the original data, resulting in the small-sample-size, large-dimension problem. Recently, weak hierarchical Lasso, a sparse interaction regression model, was proposed; it produces a sparse and hierarchically structured estimator by exploiting the Lasso penalty and a set of hierarchical constraints. However, the hierarchical constraints make the problem non-convex, and the existing method solves its convex relaxation instead, which needs additional conditions to guarantee the hierarchical structure. In this article, we propose to directly solve the non-convex weak hierarchical Lasso by making use of the General Iterative Shrinkage and Thresholding (GIST) optimization framework, which has been shown to be efficient for solving non-convex sparse formulations. The key step in GIST is to compute a sequence of proximal operators. One of our key technical contributions is to show that the proximal operator associated with the non-convex weak hierarchical Lasso admits a closed-form solution. However, a naive approach for solving each subproblem of the proximal operator leads to a quadratic time complexity, which is not desirable for large-size problems. To this end, we further develop an efficient algorithm for computing the subproblems with a linearithmic time complexity. We have conducted extensive experiments on both synthetic and real datasets. Results show that our proposed algorithm is much more efficient and effective than its convex relaxation. In addition, we extend the technique to perform the optimization-based hierarchical testing of pairwise interactions for binary classification problems, which is essentially the proximal operator associated with weak hierarchical Lasso. Simulation studies show that the non-convex hierarchical testing framework outperforms the convex relaxation when a hierarchical structure exists between main effects and interactions.
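
The proximal-operator iteration mentioned in the abstract can be illustrated with a much simpler stand-in. The sketch below is not the authors' algorithm: it uses a fixed step size rather than GIST's step-size search, and the plain Lasso proximal operator (soft-thresholding) rather than the closed-form weak hierarchical operator derived in the paper, which this record does not reproduce. The function names `soft_threshold`, `pairwise_interactions`, and `proximal_gradient_lasso` are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (plain Lasso, no hierarchy constraint)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def pairwise_interactions(X):
    """Return the d*(d-1)/2 product columns x_i * x_j for i < j (needs d >= 2)."""
    n, d = X.shape
    cols = [X[:, i] * X[:, j] for i in range(d) for j in range(i + 1, d)]
    return np.column_stack(cols)


def proximal_gradient_lasso(A, y, lam, n_iter=500):
    """Minimize 0.5 * ||A w - y||^2 + lam * ||w||_1 by proximal gradient.

    Assumption: fixed step 1/L, where L is the squared spectral norm of A.
    Each iteration takes a gradient step on the smooth least-squares term
    and then applies the proximal operator of the penalty.
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w


# Example: main effects plus pairwise interactions in one design matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))
A = np.hstack([X, pairwise_interactions(X)])  # 4 main effects + 6 interactions
y = 2.0 * X[:, 0] + 1.5 * X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(50)
w_hat = proximal_gradient_lasso(A, y, lam=1.0)
```

With d original features the design matrix gains d(d-1)/2 interaction columns, which is the dimensionality growth the abstract refers to. Enforcing the weak hierarchy would require replacing `soft_threshold` with the paper's operator.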

  • Publication date: 2016-02
