Document Type: Original Research Paper


Faculty of Electrical and Computer Engineering, Malek Ashtar University of Technology, Tehran, Iran


Background and Objectives: Action recognition, the process of labeling an unknown action in a query video, is a challenging problem due to event complexity, variations in imaging conditions, and intra- and inter-individual action variability. A number of solutions have been proposed to solve the action recognition problem. Many of these frameworks assume that each video sequence contains only one action class. Therefore, a video sequence must first be broken down into sub-sequences, each containing only a single action class.
Methods: In this paper, we develop an unsupervised action change detection method that detects the times at which actions change, without classifying the actions. In this method, a silhouette-based framework is used for action representation. This representation uses xt patterns. An xt pattern is a selected frame of the xty volume. This volume is obtained by rotating the traditional space-time volume and permuting its axes. In the xty volume, each frame spans the space (x) and time (t) axes, and the y value specifies the frame number.
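The axis permutation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `to_xty` and the toy dimensions are my own. A silhouette video is treated as a binary volume indexed `V[t, y, x]`; swapping the `t` and `y` axes makes `y` the frame index, so each frame of the result is an x-t pattern showing how one image row evolves over the whole sequence.

```python
import numpy as np

def to_xty(volume):
    """Permute a (T, H, W) silhouette stack V[t, y, x] into the
    (H, T, W) xty volume, whose y-th frame is an x-t pattern."""
    return np.transpose(volume, (1, 0, 2))

# Toy stand-in for a real silhouette sequence: 50 frames of 32x24 masks.
T, H, W = 50, 32, 24
video = np.random.rand(T, H, W) > 0.5

xty = to_xty(video)
xt_pattern = xty[H // 2]   # one selected xt pattern (middle image row)

print(xty.shape)           # (32, 50, 24): H frames of size T x W
print(xt_pattern.shape)    # (50, 24): time along one axis, x along the other
```

Because `np.transpose` returns a view, the xty volume costs no extra memory; only the indexing order changes.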
Results: To test the performance of the proposed method, we created 105 artificial videos using the Weizmann dataset, as well as a time-continuous camera-captured video, and conducted the experiments on this dataset. The precision of the proposed method was 98.13% and the recall was 100%.
Conclusion: The proposed unsupervised approach can detect action changes with high precision. Therefore, it can be combined with an action recognition method to design an integrated action recognition system.

©2020 The author(s). This is an open access article distributed under the terms of the Creative Commons Attribution (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, as long as the original authors and source are cited. No permission is required from the authors or the publishers.





Journal of Electrical and Computer Engineering Innovations (JECEI) welcomes letters to the editor for post-publication discussion and corrections, which allow debate after publication on its site through the Letters to the Editor section. Letters pertaining to manuscripts published in JECEI should be sent to the editorial office of JECEI within three months of online publication or before print publication, except for critiques of original research. The following points should be considered before sending letters (comments) to the editor.

[1] Letters that include statements of statistics, facts, research, or theories should include appropriate references, although more than three are discouraged.

[2] Letters that are personal attacks on an author rather than thoughtful criticism of the author’s ideas will not be considered for publication.

[3] Letters can be no more than 300 words in length.

[4] Letter writers should include a statement at the beginning of the letter stating whether or not it is being submitted for publication.

[5] Anonymous letters will not be considered.

[6] Letter writers must include their city and state of residence or work.

[7] Letters will be edited for clarity and length.