Learning Model of Human Body Movement using Convolutional Neural Network and Long Short-Term Memory

Nutchanun Chinpanthana


The recognition and understanding of human activities remains an active research topic in computer vision, with wide use in human-computer interaction, human monitoring systems, human behavior analysis, and other fields. The problem is challenging because of the varied interactions between persons and the motion dynamics between humans and objects. Many algorithms based on deep neural network models have been proposed to recognize human activity. In these approaches a deep neural network classifier is built around the target human action, but some inputs that should have been classified as actions were not. To overcome this problem, this paper proposes a model that combines a deep convolutional neural network with long short-term memory to recognize human motion poses. The proposed model extracts key point features automatically with OpenPose and classifies activities from the trajectories of the human body. The paper presents three main steps: (1) data pre-processing and feature extraction, (2) deep learning with a convolutional neural network and long short-term memory, and (3) efficiency measurement and evaluation of the experimental results. We tested our approach on activities of daily living with the URFall datasets. The experimental results indicate that our framework offers performance improvements: the proposed model achieves maximum activity classification success rates of 81% and 76.9% with 256 and 128 hidden nodes, respectively, in data set II.
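
The pipeline described above (key points per frame, a convolutional stage, then long short-term memory and a classifier) can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the architecture used in the paper: the 18 key points per frame, the 30-frame window, the single 1D convolution with 32 filters, the five activity classes, and the random weights are all hypothetical choices. Only the 256 hidden nodes come from the abstract, and the random array stands in for real OpenPose output.

```python
import numpy as np

def conv1d_relu(x, w, b):
    """Valid 1D convolution over time plus ReLU. x: (T, C_in), w: (K, C_in, C_out)."""
    k = w.shape[0]
    steps = x.shape[0] - k + 1
    out = np.stack([
        np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) for t in range(steps)
    ])
    return np.maximum(out + b, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_hidden(x, w_x, w_h, b):
    """Run one LSTM layer over x: (T, D) and return the final hidden state (H,)."""
    hidden = w_h.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in x:
        gates = x_t @ w_x + h @ w_h + b
        i, f, g, o = np.split(gates, 4)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def init_params(rng, n_features, n_filters, kernel, hidden, n_classes):
    """Random (untrained) weights; a real model would learn these from data."""
    s = 0.1
    return {
        "conv_w": rng.normal(0.0, s, (kernel, n_features, n_filters)),
        "conv_b": np.zeros(n_filters),
        "lstm_wx": rng.normal(0.0, s, (n_filters, 4 * hidden)),
        "lstm_wh": rng.normal(0.0, s, (hidden, 4 * hidden)),
        "lstm_b": np.zeros(4 * hidden),
        "out_w": rng.normal(0.0, s, (hidden, n_classes)),
        "out_b": np.zeros(n_classes),
    }

def classify_sequence(keypoints, p):
    """keypoints: (T, J, 2) array of (x, y) per joint per frame -> class probabilities."""
    x = keypoints.reshape(keypoints.shape[0], -1)             # flatten joints per frame
    feats = conv1d_relu(x, p["conv_w"], p["conv_b"])          # local temporal features
    h = lstm_last_hidden(feats, p["lstm_wx"], p["lstm_wh"], p["lstm_b"])
    return softmax(h @ p["out_w"] + p["out_b"])

# Hypothetical sizes: 30 frames, 18 key points, 5 activity classes.
T, J, CLASSES = 30, 18, 5
rng = np.random.default_rng(0)
params = init_params(rng, n_features=J * 2, n_filters=32, kernel=3,
                     hidden=256, n_classes=CLASSES)
sequence = rng.random((T, J, 2))  # placeholder for OpenPose key point trajectories
probs = classify_sequence(sequence, params)
print(probs.shape, int(probs.argmax()))
```

Because the weights are untrained, the predicted class is meaningless; the sketch only shows how a key point trajectory flows through the convolutional and recurrent stages to a probability vector over activities.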


Article Details

How to Cite
N. Chinpanthana, “Learning Model of Human Body Movement using Convolutional Neural Network and Long Short-Term Memory”, JIST, vol. 12, no. 1, pp. 27–36, Jun. 2022.
Academic Article: Information Systems (Detail in Scope of Journal)

