The Accuracy Comparison of Time Series Classification in Vector Space between SAX and BOSS Methods: A Case Study of Electrocardiogram
Abstract
The electrocardiogram (ECG) is an important tool for diagnosing heart disorders. However, ECG recordings may contain different types of noise arising from various factors, potentially resulting in diagnostic errors. This research compares the Symbolic Aggregate Approximation in Vector Space (SAXVSM) and Bag of Symbolic Fourier Approximation Symbols in Vector Space (BOSSVS) methods for classifying noisy ECG data, in order to choose a suitable classification algorithm for the ECG5000 dataset, which is available in the PhysioNet database. Four types of ECG noise were simulated and added to the data: 1) electromyography (EMG) noise, 2) powerline interference, 3) baseline wander, and 4) composite noise, each at the 25%, 50%, and 100% levels. SAXVSM and BOSSVS were then compared on classifying normal versus abnormal heart rhythms. The results show that both algorithms achieve similarly high performance on all 13 datasets: accuracy and F1 score are 97-99%, precision is 95-99%, and recall is 97-100%. However, BOSSVS has a longer running time than SAXVSM.
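The noise-injection setup described in the abstract can be illustrated with a minimal sketch. The abstract does not give the noise models or define the percentage levels, so everything here is an assumption made for illustration: EMG is approximated as Gaussian white noise, powerline interference as a 50 Hz sinusoid, baseline wander as a 0.5 Hz sinusoid, and composite noise as the average of the three. A level such as 50% is taken to mean a noise standard deviation equal to 50% of the signal's standard deviation. The sampling rate `fs` and the function names are hypothetical. The count of 13 datasets is consistent with one clean dataset plus 4 noise types at 3 levels.

```python
import math
import random

NOISE_TYPES = ("emg", "powerline", "baseline_wander", "composite")
LEVELS = (0.25, 0.50, 1.00)


def std(xs):
    """Population standard deviation of a list of floats."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def make_noise(kind, n, fs, rng):
    """Return n samples of unscaled noise (assumed models, not the paper's)."""
    t = [i / fs for i in range(n)]
    if kind == "emg":
        # Assumption: EMG approximated as zero-mean Gaussian white noise.
        return [rng.gauss(0.0, 1.0) for _ in t]
    if kind == "powerline":
        # Assumption: mains interference as a pure 50 Hz sinusoid.
        return [math.sin(2 * math.pi * 50.0 * ti) for ti in t]
    if kind == "baseline_wander":
        # Assumption: slow baseline drift as a 0.5 Hz sinusoid.
        return [math.sin(2 * math.pi * 0.5 * ti) for ti in t]
    if kind == "composite":
        # Assumption: composite noise is the average of the three types above.
        parts = [make_noise(k, n, fs, rng) for k in NOISE_TYPES[:3]]
        return [sum(vals) / 3 for vals in zip(*parts)]
    raise ValueError(f"unknown noise kind: {kind}")


def add_noise(signal, kind, level, fs=360.0, seed=0):
    """Add noise whose std is `level` times the std of `signal`.

    fs is a hypothetical sampling rate in Hz, chosen only for this sketch.
    """
    rng = random.Random(seed)
    noise = make_noise(kind, len(signal), fs, rng)
    scale = level * std(signal) / std(noise)
    return [s + scale * v for s, v in zip(signal, noise)]


def build_variants(signal):
    """Return the clean series plus one noisy copy per (type, level) pair."""
    variants = {"clean": list(signal)}
    for kind in NOISE_TYPES:
        for level in LEVELS:
            name = f"{kind}_{int(level * 100)}"
            variants[name] = add_noise(signal, kind, level)
    return variants


# Toy stand-in for one ECG series (a single sine cycle, not real ECG data).
toy = [math.sin(2 * math.pi * i / 140) for i in range(140)]
variants = build_variants(toy)
print(len(variants))  # 13: 1 clean + 4 noise types x 3 levels
```

In a real experiment, each variant would be built for every series in the training and test sets before fitting the two classifiers. The scaling rule used here is only one possible reading of the percentage levels.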
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.