Imputation Methods in Time Series with a Trend and a Consecutive Missing Value Pattern
Keywords:
Missing values, imputation method, consecutive missing values, time series
Abstract
Time series with missing values occur in almost every domain of the applied sciences. Ignoring them, especially when they form a long consecutive run, can lead to a loss of efficiency and unreliable results. An appropriate imputation method replaces the missing values with substitutes and can lead to more accurate forecasting. However, which method is appropriate depends on the type of time series and on the missing data pattern. This study focuses on time series with a trend that contain consecutive missing values. Ten real datasets were used to evaluate the imputation methods under three scenarios in which consecutive values amounting to 10%, 20%, and 50% of the series were artificially removed. Six imputation approaches were compared in terms of root-mean-squared error (RMSE) and mean-absolute-percentage error (MAPE): interpolation, Kalman, moving average (MA), last observation carried forward (LOCF), mean, and linear trend at point (LTP). The interpolation, Kalman, and LTP methods were far superior to the other three, outperforming mean imputation by about 80% on average and LOCF and MA imputation by 30-60% on average. Hence, the interpolation, Kalman, and LTP methods are appropriate for imputing consecutive missing values in time-series data exhibiting a trend.
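The evaluation procedure described above can be illustrated with a minimal sketch. It is not the study's code or data: the synthetic series, the gap position, and all function names are assumptions made for illustration, and only three of the six methods (linear interpolation, LOCF, and mean) are shown. A consecutive block covering 20% of a trending series is removed, each method fills it, and RMSE and MAPE are computed on the removed points only.

```python
import numpy as np

def make_gap(y, start, length):
    """Return a copy of y with `length` consecutive values set to NaN."""
    out = np.asarray(y, dtype=float).copy()
    out[start:start + length] = np.nan
    return out

def impute_linear(y):
    """Linear interpolation between the observed neighbours of each gap."""
    idx = np.arange(len(y))
    obs = ~np.isnan(y)
    return np.interp(idx, idx[obs], y[obs])

def impute_locf(y):
    """Last observation carried forward (assumes y[0] is observed)."""
    out = y.copy()
    for i in range(1, len(out)):
        if np.isnan(out[i]):
            out[i] = out[i - 1]
    return out

def impute_mean(y):
    """Replace every missing value with the mean of the observed values."""
    out = y.copy()
    out[np.isnan(out)] = np.nanmean(y)
    return out

def rmse(truth, imputed, mask):
    """Root-mean-squared error over the artificially removed points."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))

def mape(truth, imputed, mask):
    """Mean-absolute-percentage error (in %) over the removed points."""
    return float(100.0 * np.mean(np.abs((truth[mask] - imputed[mask]) / truth[mask])))

# Hypothetical trending series: linear trend plus a small deterministic wiggle.
t = np.arange(50)
truth = 10.0 + 2.0 * t + 3.0 * np.sin(t / 3.0)

# Remove 10 consecutive points (20% of the series); the position is arbitrary.
gappy = make_gap(truth, start=35, length=10)
mask = np.isnan(gappy)

methods = {"interpolation": impute_linear, "LOCF": impute_locf, "mean": impute_mean}
scores = {name: (rmse(truth, fn(gappy), mask), mape(truth, fn(gappy), mask))
          for name, fn in methods.items()}

for name, (r, m) in scores.items():
    print(f"{name:14s} RMSE={r:8.3f}  MAPE={m:7.2f}%")
```

On this toy series, interpolation follows the trend across the gap, whereas LOCF and mean imputation fill it with a constant, so their errors grow with the slope and the gap length. The exact figures depend on the assumed series and gap position and should not be read as a reproduction of the study's results.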