Performance Comparison of Penalized Regression Methods in Poisson Regression under High-Dimensional Sparse Data with Multicollinearity
Keywords:
Ridge regression, LASSO, adaptive LASSO, Monte Carlo simulation
Abstract
Ridge regression, the least absolute shrinkage and selection operator (LASSO), and adaptive LASSO can be used to fit high-dimensional count data with the Poisson model. However, the performance of these methods has not been explicitly studied for sparse data with multicollinearity. This paper therefore compares the prediction performance of ridge regression, LASSO, and adaptive LASSO using the median prediction mean square error (mPMSE), and compares the variable selection of LASSO and adaptive LASSO using two criteria of incorrect variable selection: the false negative rate (FNR) and the false positive rate (FPR). Constant, Toeplitz, and hub Toeplitz correlation structures are considered. Monte Carlo simulations with 1,000 iterations were performed. The results showed that adaptive LASSO produced the lowest mPMSE, except when the correlation was higher, in which case ridge regression had the lowest mPMSE. In terms of FNR, LASSO performed better than adaptive LASSO; in terms of FPR, adaptive LASSO performed better than LASSO.
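To make the evaluation criteria concrete, the following is a minimal Python sketch of how the quantities named in the abstract could be computed. It is an illustration under stated assumptions, not the code used in the study. The definitions are assumed to be the conventional ones: FNR is the share of truly nonzero coefficients estimated as zero, FPR is the share of truly zero coefficients estimated as nonzero, and mPMSE is the median over replications of the mean squared prediction error. The Toeplitz structure is assumed to have entries rho^|i-j|. All function names are hypothetical, and the hub Toeplitz structure is omitted because the abstract does not define it.

```python
from statistics import median
from typing import List, Sequence, Tuple


def constant_corr(p: int, rho: float) -> List[List[float]]:
    """Constant correlation (assumed form): 1 on the diagonal, rho elsewhere."""
    return [[1.0 if i == j else rho for j in range(p)] for i in range(p)]


def toeplitz_corr(p: int, rho: float) -> List[List[float]]:
    """Toeplitz correlation (assumed form): entry (i, j) equals rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(p)] for i in range(p)]


def selection_error_rates(
    true_beta: Sequence[float], est_beta: Sequence[float], tol: float = 0.0
) -> Tuple[float, float]:
    """Return (FNR, FPR) for one fitted model.

    A coefficient counts as selected when |estimate| > tol.
    """
    fn = tp = fp = tn = 0
    for b, b_hat in zip(true_beta, est_beta):
        is_signal = b != 0
        selected = abs(b_hat) > tol
        if is_signal and selected:
            tp += 1
        elif is_signal and not selected:
            fn += 1  # true predictor that was dropped
        elif selected:
            fp += 1  # noise predictor that was kept
        else:
            tn += 1
    # Guard against division by zero when one class is empty.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr


def pmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared prediction error for one replication."""
    return sum((y - m) ** 2 for y, m in zip(y_true, y_pred)) / len(y_true)


def median_pmse(pmse_values: Sequence[float]) -> float:
    """Median of the per-replication PMSE values (assumed mPMSE definition)."""
    return median(pmse_values)


if __name__ == "__main__":
    # Hypothetical sparse truth with two signals among five predictors.
    beta = [1.5, -2.0, 0.0, 0.0, 0.0]
    beta_hat = [1.2, 0.0, 0.3, 0.0, 0.0]
    print(selection_error_rates(beta, beta_hat))
    print(median_pmse([0.5, 2.0, 1.0]))
```

In a simulation of the kind described, `selection_error_rates` and `pmse` would be evaluated once per Monte Carlo replication and then summarized across the 1,000 iterations. Ridge regression is left out of the FNR and FPR comparison because it shrinks coefficients without setting them exactly to zero.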