A Novel Technique to Detect and Track Multiple Objects in Dynamic Video Surveillance Systems.
DOI: https://doi.org/10.9781/ijimai.2022.01.002

Keywords: Convolutional Neural Network (CNN), Machine Learning, Object Detection, Video Surveillance

Abstract
Video surveillance is one of the key state-of-the-art technologies used to monitor different areas of modern society, such as public spaces, city traffic, and forests, and surveillance systems have become especially relevant in the digital era. Video surveillance systems and their video analytics have become indispensable due to the rise in crime and unethical behavior, making the tracking of individual objects in video an essential capability of modern society. With the spread of video surveillance, performance measures for such systems must also improve to keep pace with ever-increasing crime rates. Many methodologies for video surveillance have been introduced, ranging from single-object detection with one or more cameras to multiple-object detection with one or more cameras. Despite this, performance benchmarks and metrics require further improvement. While mechanisms exist for single- or multiple-object detection and prediction on videos or images, none meets the criteria of detecting and tracking multiple objects in both static and dynamic environments. Real-world multiple-object detection and prediction systems are therefore needed that are both accurate and fast, and that can be adopted in static as well as dynamic environments. This paper introduces the Densely Feature selection Convolutional neural Network – Hyper Parameter tuning (DFCNHP) protocol, a hybrid approach with faster prediction time and high accuracy. The proposed system successfully tracks multiple objects from multiple channels by combining a dense block, feature selection, background subtraction, and Bayesian methods.
The experiments conducted demonstrated an accuracy of 98% and a prediction time of 1.11, and these results were compared with existing methods such as Kalman Filtering (KF) and Deep Neural Network (DNN).
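One of the components the abstract names is background subtraction for isolating moving objects before detection. The sketch below is not the authors' DFCNHP implementation; it is a minimal, generic illustration of the idea using a running-average background model in NumPy, with the blending rate `alpha` and the pixel `threshold` chosen arbitrarily for the example.

```python
import numpy as np

def background_subtract(frames, alpha=0.05, threshold=30):
    """Running-average background subtraction: returns one boolean
    foreground mask per frame, marking pixels that deviate from a
    slowly adapting background estimate."""
    background = frames[0].astype(np.float64)
    masks = []
    for frame in frames:
        diff = np.abs(frame.astype(np.float64) - background)
        masks.append(diff > threshold)
        # Blend the current frame into the background estimate.
        background = (1 - alpha) * background + alpha * frame
    return masks

# Synthetic example: a static 64x64 scene with one bright moving block.
frames = [np.zeros((64, 64), dtype=np.uint8) for _ in range(10)]
for t, frame in enumerate(frames):
    frame[20:28, 5 + 4 * t : 13 + 4 * t] = 255  # object shifts right each frame
masks = background_subtract(frames)
print(sum(m.any() for m in masks))  # number of frames with detected motion
```

In a full pipeline, the connected regions of each mask would be fed to the detector and tracker as candidate objects; a slow `alpha` keeps briefly stationary objects in the foreground longer, at the cost of adapting slowly to lighting changes.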