Imputation Methods for Multiple Regression with Missing Heteroscedastic Data

Authors

  • Muhammad Asif, Department of Economics, Lasbela University of Agriculture, Water and Marine Sciences, Uthal, Balochistan, Pakistan
  • Klairung Samart, Division of Computational Science, Faculty of Science, Prince of Songkla University, Songkhla, Thailand

Keywords:

Missing data, imputation, equivalent weight, bias, mean squared error

Abstract

The purpose of this research is to compare the efficiency of different imputation methods for multiple regression analysis of heteroscedastic data in which the dependent variable is missing at random. The imputation methods considered are mean imputation, hot deck imputation, k-nearest neighbors (KNN) imputation, and stochastic regression imputation, along with three proposed composite methods: hot deck and KNN imputation with equivalent weight (HKEW), hot deck and stochastic regression imputation with equivalent weight (HSEW), and mean and stochastic regression imputation with equivalent weight (MSEW). The seven methods were compared in a simulation study that varied the sample size and the missing percentage. The criteria for comparing the efficiency of the estimators are bias and mean squared error (MSE). The results show that stochastic regression imputation performed well in terms of bias in all situations. In terms of MSE, mean imputation performed well when the sample size was small to medium, whereas MSEW imputation performed well when the sample size was large and the missing percentage was high (30-40%).
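The composite idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not define "equivalent weight", so the sketch assumes it means averaging the two imputed values with equal weights of 0.5 each. The function names, the simulated data, the error structure, and the 30% missing rate are all hypothetical choices made for illustration and are not taken from the paper. The missingness mask here is drawn completely at random, which is a simplified special case of the missing-at-random mechanism the study considers.

```python
import numpy as np

def mean_impute(y, missing):
    """Replace missing y values with the mean of the observed y values."""
    out = y.copy()
    out[missing] = y[~missing].mean()
    return out

def stochastic_regression_impute(X, y, missing, rng):
    """Fit OLS on the observed rows, then impute prediction plus random noise."""
    Xd = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta, *_ = np.linalg.lstsq(Xd[~missing], y[~missing], rcond=None)
    resid = y[~missing] - Xd[~missing] @ beta
    dof = max(int((~missing).sum()) - Xd.shape[1], 1)
    sigma = np.sqrt(resid @ resid / dof)  # residual standard deviation
    out = y.copy()
    noise = rng.normal(0.0, sigma, size=int(missing.sum()))
    out[missing] = Xd[missing] @ beta + noise
    return out

def equal_weight_composite(imp_a, imp_b):
    """Assumed reading of 'equivalent weight': average two imputations 50/50."""
    return 0.5 * imp_a + 0.5 * imp_b

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(1.0, 5.0, size=(n, 2))
# Heteroscedastic errors: the error standard deviation grows with X[:, 0].
y_true = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(0.0, 0.5 * X[:, 0])

# Simplified missingness: each response is dropped with probability 0.3.
missing = rng.random(n) < 0.3
y = y_true.copy()
y[missing] = np.nan

y_mean = mean_impute(y, missing)
y_sr = stochastic_regression_impute(X, y, missing, rng)
y_msew = equal_weight_composite(y_mean, y_sr)  # MSEW-style composite
```

Observed responses pass through the composite unchanged, and only the missing entries are blended. To mirror the study's criteria, one would refit the regression on each completed data set over many simulated replications and summarize the bias and MSE of the coefficient estimates.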

Published

2021-12-30

How to Cite

Asif, M., & Samart, K. (2021). Imputation Methods for Multiple Regression with Missing Heteroscedastic Data. Thailand Statistician, 20(1), 1–15. Retrieved from https://ph02.tci-thaijo.org/index.php/thaistat/article/view/245845

Section

Articles