Peer-Reviewed

Pedestrian Tracking Algorithm Combining Contextual Information and Attention Mechanism

Received: 8 October 2021    Accepted: 25 October 2021    Published: 5 November 2021
Abstract

In real scenes, pedestrians are often occluded or appear at small sizes, so a convolutional neural network cannot fully extract their features and detection results are poor. Across two adjacent frames, data association for the same pedestrian is then prone to errors, which degrades tracking performance. To address these problems, a pedestrian tracking algorithm based on the anchor-free idea is improved. A context information fusion module is proposed to strengthen the model's feature extraction across different receptive fields and to improve detection and tracking performance when pedestrians are small. In addition, to let the model learn to attend to the effective information in the feature layer, a coordinate attention mechanism is introduced; it guides the model to learn weights for different channels and different regions of the feature layer and improves tracking performance when pedestrians are occluded. The tracking performance of the model was verified experimentally on the MOT16 dataset. The results show that, compared with other mainstream pedestrian tracking algorithms, the improved algorithm achieves higher tracking accuracy and fewer pedestrian ID switches. Its tracking accuracy is 70.74.
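The abstract reports a tracking accuracy and a count of ID switches without naming the metric. On MOT16 the usual accuracy measure is MOTA from the CLEAR MOT metrics, so the sketch below assumes that is what is meant. It shows how misses, false positives, and ID switches combine into one score. The `FrameErrors` container, the `mota` function name, and the example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class FrameErrors:
    """Per-frame error counts (hypothetical container for this sketch)."""
    gt: int    # ground-truth pedestrians in the frame
    fn: int    # missed pedestrians (false negatives)
    fp: int    # spurious detections (false positives)
    idsw: int  # identity switches


def mota(frames):
    """Return MOTA as a percentage.

    MOTA = 1 - (sum FN + sum FP + sum IDSW) / sum GT, summed over all frames.
    The value can be negative when errors outnumber ground-truth objects.
    """
    total_gt = sum(f.gt for f in frames)
    if total_gt == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = sum(f.fn + f.fp + f.idsw for f in frames)
    return 100.0 * (1.0 - errors / total_gt)


# Made-up counts for illustration only, not the paper's experimental data.
frames = [
    FrameErrors(gt=10, fn=2, fp=1, idsw=0),
    FrameErrors(gt=10, fn=1, fp=0, idsw=1),
]
score = mota(frames)
print(f"MOTA = {score:.2f}")
```

Because ID switches enter the numerator directly, reducing them raises MOTA even when detection quality is unchanged, which is consistent with the abstract reporting both figures together.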

Published in American Journal of Computer Science and Technology (Volume 4, Issue 4)
DOI 10.11648/j.ajcst.20210404.14
Page(s) 111-118
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Pedestrian Tracking, Anchor-Free, Context Information, Attention Mechanism, JDE

Cite This Article
  • APA Style

    Shunliang Xiao, Zanxia Qiang, Weiguang Liu, Xianfu Bao. (2021). Pedestrian Tracking Algorithm Combining Contextual Information and Attention Mechanism. American Journal of Computer Science and Technology, 4(4), 111-118. https://doi.org/10.11648/j.ajcst.20210404.14

    ACS Style

    Shunliang Xiao; Zanxia Qiang; Weiguang Liu; Xianfu Bao. Pedestrian Tracking Algorithm Combining Contextual Information and Attention Mechanism. Am. J. Comput. Sci. Technol. 2021, 4(4), 111-118. doi: 10.11648/j.ajcst.20210404.14

    AMA Style

    Shunliang Xiao, Zanxia Qiang, Weiguang Liu, Xianfu Bao. Pedestrian Tracking Algorithm Combining Contextual Information and Attention Mechanism. Am J Comput Sci Technol. 2021;4(4):111-118. doi: 10.11648/j.ajcst.20210404.14

  • @article{10.11648/j.ajcst.20210404.14,
      author = {Shunliang Xiao and Zanxia Qiang and Weiguang Liu and Xianfu Bao},
      title = {Pedestrian Tracking Algorithm Combining Contextual Information and Attention Mechanism},
      journal = {American Journal of Computer Science and Technology},
      volume = {4},
      number = {4},
      pages = {111-118},
      doi = {10.11648/j.ajcst.20210404.14},
      url = {https://doi.org/10.11648/j.ajcst.20210404.14},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajcst.20210404.14},
     year = {2021}
    }
    

  • TY  - JOUR
    T1  - Pedestrian Tracking Algorithm Combining Contextual Information and Attention Mechanism
    AU  - Shunliang Xiao
    AU  - Zanxia Qiang
    AU  - Weiguang Liu
    AU  - Xianfu Bao
    Y1  - 2021/11/05
    PY  - 2021
    N1  - https://doi.org/10.11648/j.ajcst.20210404.14
    DO  - 10.11648/j.ajcst.20210404.14
    T2  - American Journal of Computer Science and Technology
    JF  - American Journal of Computer Science and Technology
    JO  - American Journal of Computer Science and Technology
    SP  - 111
    EP  - 118
    PB  - Science Publishing Group
    SN  - 2640-012X
    UR  - https://doi.org/10.11648/j.ajcst.20210404.14
    VL  - 4
    IS  - 4
    ER  - 

Author Information
  • Shunliang Xiao, School of Computer Science, Zhongyuan University of Technology, Zhengzhou, China

  • Zanxia Qiang, School of Computer Science, Zhongyuan University of Technology, Zhengzhou, China

  • Weiguang Liu, School of Computer Science, Zhongyuan University of Technology, Zhengzhou, China

  • Xianfu Bao, School of Computer Science, Zhongyuan University of Technology, Zhengzhou, China
