A Variational Level Set Approach to Multiphase Multi-Object Tracking in Camera Network Base on Deep Features

Pazouki, E.; Rahmati, M.

doi:10.22061/jecei.2021.7649.417

Document Type : Original Research Paper

Authors

E. Pazouki ¹
M. Rahmati ²

¹ Artificial Intelligence Department, Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran

² Artificial Intelligent and Robotics Department, Faculty of Computer and Information Technology Engineering, Amirkabir University of Technology, Tehran, Iran

https://doi.org/10.22061/jecei.2021.7649.417

Abstract

Background and Objectives: Object tracking in video streams is a machine vision problem with many applications. It is divided into several categories according to the type of object, the number of objects, and the other inputs used in tracking. Multi-object tracking in a camera network is one of the most complex of these categories. Its goal is to extract the persistent traces of several objects moving simultaneously in a wide area monitored by a network of cameras. This type of tracking is usually done in two steps. In the first step, the traces of each object in each camera, called tracklets, are extracted. In the second step, the persistent traces of the objects are obtained by associating the extracted tracklets of all cameras in the monitored wide area. Here, we introduce a novel variational approach based on deep features to associate the tracklets.
Methods: A variational model with a multiphase level set representation is introduced for this purpose. The persistent traces of all objects are obtained by optimizing this model using the Euler-Lagrange equation. Deep learning is used to extract deep features of the appearance and motion of objects. Specifically, a ResNet50 network pre-trained on ImageNet and a transformer neural network trained on motion parameters of tracklets, such as acceleration and orientation change rate, are used to extract the deep features.
Results: Results on three well-known real datasets and a synthesized dataset show that the proposed model achieves competitive performance while using less additional context information about the camera network and the objects than other proposed methods. The evaluations show that the proposed model can solve complex problems using minimal initial knowledge.
Conclusion: The multiphase model with deep features presented in this paper gives 9% better results than the multiphase model without deep features on the TCF and FS metrics, and 8% better results on the MT metric.
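
The motion parameters named in the Methods paragraph (acceleration and orientation change rate) can be illustrated with a minimal finite-difference sketch. This is not the authors' implementation: the function name `motion_parameters`, the `(time, x, y)` tracklet layout, and the finite-difference scheme are all assumptions made for illustration, and the sketch only computes candidate inputs, not the transformer network that is trained on them.

```python
import math


def motion_parameters(times, xs, ys):
    """Hypothetical helper: finite-difference motion descriptors of one tracklet.

    Returns one (acceleration, orientation_change_rate) pair per interior
    sample. Acceleration is the rate of change of speed; orientation change
    rate is the rate of change of heading, in radians per time unit.
    """
    if not (len(times) == len(xs) == len(ys)) or len(times) < 3:
        raise ValueError("need at least three samples of equal length")

    speeds, headings, dts = [], [], []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        dx, dy = xs[i] - xs[i - 1], ys[i] - ys[i - 1]
        speeds.append(math.hypot(dx, dy) / dt)   # speed over this interval
        headings.append(math.atan2(dy, dx))      # direction of travel
        dts.append(dt)

    features = []
    for i in range(1, len(speeds)):
        # Time between the midpoints of two consecutive intervals.
        dt_mid = 0.5 * (dts[i - 1] + dts[i])
        accel = (speeds[i] - speeds[i - 1]) / dt_mid
        dtheta = headings[i] - headings[i - 1]
        # Wrap the heading difference into [-pi, pi] so a turn across the
        # +/-pi boundary is not mistaken for a near-full rotation.
        dtheta = math.atan2(math.sin(dtheta), math.cos(dtheta))
        features.append((accel, dtheta / dt_mid))
    return features


# Object moving along the x axis with position x = t^2 (constant acceleration).
print(motion_parameters([0, 1, 2, 3], [0, 1, 4, 9], [0, 0, 0, 0]))
# → [(2.0, 0.0), (2.0, 0.0)]
```

In this sketch a tracklet of n samples yields n - 2 feature pairs. How such a sequence is normalized and fed to the transformer is not specified in the abstract and is left open here.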

Keywords

20.1001.1.23223952.2021.9.2.8.6

Main Subjects

Object Tracking

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/

Publisher’s Note

JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Publisher

Shahid Rajaee Teacher Training University

Journal of Electrical and Computer Engineering Innovations (JECEI)

Volume 9, Issue 2
July 2021
Pages 203-214
