Random Subspace Method for high-dimensional regression with the R package regRSM
Robert Kłopotek, Jan Mielniczuk, Paweł Roman Teisseyre
Abstract
Model selection and variable importance assessment in high-dimensional regression are among the most important tasks in contemporary applied statistics. In our procedure, implemented in the package regRSM, the Random Subspace Method (RSM) is used to construct a variable importance measure. In the first step, the RSM assigns an importance measure to each variable and the variables are ordered by it. In the second step, the final subset of variables is chosen from the hierarchical list of models given by this ordering, using information criteria or a validation set. Modifications of the original method, such as the weighted Random Subspace Method and a version with initial screening of redundant variables, are discussed. We developed parallel implementations that significantly reduce the computation time. In this paper, we give a brief overview of the methodology, demonstrate the package's functionality, and present a comparative study of the proposed algorithm and competing methods such as the lasso and CAR scores. In the performance tests, the computation times of the parallel implementations are compared.
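The two-step procedure outlined in the abstract can be illustrated with a short sketch. This is a minimal NumPy illustration under stated assumptions, not the regRSM implementation: the function names `rsm_scores` and `select_by_bic` are hypothetical, the squared t-statistic is assumed as the per-subspace weight of a variable, and BIC is used as one example of an information criterion.

```python
import numpy as np


def rsm_scores(X, y, subspace_size, n_draws, rng):
    """Average squared t-statistic of each variable over random subspaces.

    Assumes subspace_size < n - 1 so that each least-squares fit has
    positive residual degrees of freedom.
    """
    n, p = X.shape
    score_sum = np.zeros(p)
    counts = np.zeros(p)
    for _ in range(n_draws):
        # Draw a random subset of variables without replacement.
        idx = rng.choice(p, size=subspace_size, replace=False)
        A = np.column_stack([np.ones(n), X[:, idx]])  # intercept + subset
        beta = np.linalg.lstsq(A, y, rcond=None)[0]
        resid = y - A @ beta
        sigma2 = resid @ resid / (n - A.shape[1])
        cov = sigma2 * np.linalg.pinv(A.T @ A)
        t = beta[1:] / np.sqrt(np.diag(cov)[1:])  # skip the intercept
        score_sum[idx] += t ** 2
        counts[idx] += 1
    # Variables that were never drawn keep a score of zero.
    return score_sum / np.maximum(counts, 1)


def select_by_bic(X, y, order, max_size):
    """Pick the best model from the nested list induced by `order`."""
    n = X.shape[0]
    best_k, best_crit = 0, np.inf
    for k in range(max_size + 1):
        A = np.column_stack([np.ones(n), X[:, order[:k]]])
        beta = np.linalg.lstsq(A, y, rcond=None)[0]
        rss = np.sum((y - A @ beta) ** 2)
        crit = n * np.log(rss / n) + (k + 1) * np.log(n)  # BIC
        if crit < best_crit:
            best_k, best_crit = k, crit
    return order[:best_k]


# Synthetic example: only the first three of 50 variables are relevant.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
y = 3 * X[:, 0] + 3 * X[:, 1] + 3 * X[:, 2] + rng.standard_normal(100)

scores = rsm_scores(X, y, subspace_size=10, n_draws=500, rng=rng)
order = np.argsort(-scores)  # most important variables first
selected = select_by_bic(X, y, order, max_size=10)
```

Because each random subspace is fitted independently of the others, the loop in `rsm_scores` is the natural place to parallelize, which is consistent with the parallel implementations mentioned in the abstract.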
|Journal series||Computational Statistics, ISSN 0943-4062|
|Publication size in sheets||1.45|
|Keywords in English||Random Subspace Method, High-dimensional regression, Variable importance measure, Generalized Information Criterion, MPI, R|
|Abstract in Polish (English translation)||The paper presents an implementation of the Random Subspace Method in the regRSM package. Two new variants of the method and the implemented parallel version are discussed.|
|Score||= 15.0, 01-01-2020, ArticleFromJournal; = 20.0, 01-01-2020, ArticleFromJournal|
|Publication indicators||= 3; : 2016 = 0.952; : 2016 = 0.434 (2) - 2016=0.71 (5)|
* The presented citation count is obtained through analysis of information available on the Internet and is close to the number calculated by the Publish or Perish system.