Document Type: Original Research Paper
Authors
1 Artificial Intelligence Department, Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran
2 Artificial Intelligent and Robotics Department, Faculty of Computer and Information Technology Engineering, Amirkabir University of Technology, Tehran, Iran
Abstract
Background and Objectives: Object tracking in video streams is a machine vision problem with many applications. Depending on the type of object, the number of objects, and the other inputs used, object tracking is divided into several categories. Multi-object tracking in a camera network is one of the most complex of these. In this type of tracking, the goal of the algorithm is to extract the persistent trace of several objects moving simultaneously in a wide area that is monitored by a network of cameras. This is often done in two steps. In the first step, the traces of each object in each camera, called tracklets, are extracted. Then, the persistent traces of the objects are obtained by associating the extracted tracklets of all cameras in the monitored area. Here, we introduce a novel variational approach based on deep features to associate the tracklets.
Methods: For this purpose, a variational model with a multiphase level set representation is introduced. The persistent traces of all objects are obtained by optimizing the proposed variational model, which is done by employing the Euler-Lagrange equation. A CNN and deep learning are used to extract deep features of the appearance and motion of objects. Specifically, a ResNet50 network pre-trained on ImageNet and a transformer neural network trained on motion parameters of tracklets, such as acceleration and orientation change rate, are used to extract the deep features.
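The abstract does not state how the motion parameters of a tracklet are computed. As a minimal sketch only, assuming a tracklet is a list of (x, y) positions sampled at a fixed interval `dt`, acceleration and orientation change rate could be estimated by finite differences as below. The function name `motion_parameters` and the input layout are hypothetical and are not taken from the paper; the transformer that would consume these values is not shown.

```python
import math


def motion_parameters(points, dt=1.0):
    """Estimate per-step motion parameters of one tracklet (illustrative only).

    points: list of (x, y) positions sampled every `dt` time units (assumed layout).
    Returns a list of (acceleration magnitude, orientation change rate) tuples,
    one per pair of consecutive velocity estimates.
    """
    # First differences: velocity between consecutive positions.
    velocities = [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]

    features = []
    for (vx0, vy0), (vx1, vy1) in zip(velocities, velocities[1:]):
        # Second differences: acceleration vector and its magnitude.
        ax, ay = (vx1 - vx0) / dt, (vy1 - vy0) / dt
        accel = math.hypot(ax, ay)

        # Change of heading angle, wrapped into [-pi, pi].
        dtheta = math.atan2(vy1, vx1) - math.atan2(vy0, vx0)
        dtheta = math.atan2(math.sin(dtheta), math.cos(dtheta))

        features.append((accel, dtheta / dt))
    return features
```

For example, `motion_parameters([(0, 0), (1, 0), (1, 1)])` describes a single left turn of a quarter circle. A tracklet needs at least three positions to yield one feature tuple; shorter inputs return an empty list.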
Results: Results on three well-known real datasets and a synthesized dataset show that the proposed model achieves competitive performance while using less additional context information about the camera network and objects than other proposed methods. The evaluations show the quality of the proposed model in solving complex problems using the minimum required initial knowledge.
Conclusion: The multiphase model using deep features presented in this paper provides 9% better results than the multiphase model without deep features based on the TCF and FS metrics, and 8% better results based on the MT metric.
Keywords
- Multi-Object Tracking
- Camera Network Tracking
- MultiPhase Level Set Representation
- Variational Tracking
- Deep Features
Main Subjects
Open Access
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit: http://creativecommons.org/licenses/by/4.0/
Publisher’s Note
JECEI Publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Publisher
Shahid Rajaee Teacher Training University