Multiscale Attentional Squeeze-And-Excitation Network for Person Re-Identification.

Tiancun Guo; Qiang Zhou; Mingliang Gao; Gwanggil Jeon; David Camacho

doi:10.9781/ijimai.2025.01.001

Authors

Tiancun Guo Shandong University of Technology.
Qiang Zhou Shandong University of Technology.
Mingliang Gao Shandong University of Technology.
Gwanggil Jeon Incheon National University.
David Camacho Technical University of Madrid.

DOI:

https://doi.org/10.9781/ijimai.2025.01.001

Keywords:

Attention Mechanisms, Person Re-Identification, Cross-Domain, Multiscale

Supporting Agencies

This work has been funded by Grants: PLEC2021-007681 (XAIDisInfodemics), PID2020-117263GB-100 (FightDIS), and PCI2022-134990-2 (MARTINI) of the CHISTERA IV Cofund 2021 program, funded by MCIN/AEI/10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR”; by the Calouste Gulbenkian Foundation, under the project MuseAI - Detecting and matching suspicious claims with AI; and by the “Convenio Plurianual with the Universidad Politécnica de Madrid in the actuation line of Programa de Excelencia para el Profesorado Universitario”.

Abstract

In recent years, with the advancement of deep learning, person re-identification (Re-ID) has become increasingly significant. Existing person Re-ID methods primarily focus on optimizing network architecture to improve Re-ID performance. However, these methods often overlook how much individual features contribute to discriminating between identities, which reduces model efficacy in complex scenarios. As a solution, we use the attention mechanism to develop the lightweight Multiscale Attentional Squeeze-and-Excitation Network (MASENet), which can distinguish between significant and non-significant features. Specifically, we use the SEAttention (SE) module to amplify important feature channels and suppress redundant ones. Additionally, the Spatial Group Enhance (SGE) module is introduced so that the network can autonomously strengthen its learned semantic representations and suppress potential noise. We conduct comprehensive experiments on the Market1501, MSMT17, and VeRi-776 datasets, and cross-domain experiments on MSMT17 → Market1501, to validate model performance. Experimental results show that the proposed MASENet achieves competitive performance across all experiments.
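The channel reweighting that the abstract attributes to the SE module can be illustrated with a minimal sketch of a generic squeeze-and-excitation block. This is not the MASENet implementation: the function names (`se_block`, `matvec`), the toy weights `w1` and `w2`, and the tiny 2-channel input are all assumptions chosen for illustration, and the code uses plain Python lists rather than a deep learning framework.

```python
import math


def sigmoid(v):
    """Logistic function used to turn scores into gates in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def se_block(x, w1, w2):
    """Generic squeeze-and-excitation over x, a list of C channels (each H x W).

    w1 (hypothetical) maps C channel descriptors to a smaller hidden vector;
    w2 (hypothetical) maps the hidden vector back to C gate scores.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, h) for h in matvec(w1, z)]
    gates = [sigmoid(g) for g in matvec(w2, hidden)]
    # Scale: each channel is multiplied by its gate.
    out = [[[v * g for v in row] for row in ch] for ch, g in zip(x, gates)]
    return out, gates


# Toy input: two 2x2 channels with means 1.0 and 4.0 (illustrative values only).
x = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[4.0, 4.0], [4.0, 4.0]],
]
w1 = [[-1.0, 1.0]]    # 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]  # 1 hidden unit -> 2 gate scores

out, gates = se_block(x, w1, w2)
print(gates)
```

With these toy weights the first channel receives a gate above 0.5 and the second a gate below 0.5, so one channel is largely preserved while the other is attenuated. In a trained network the weights are learned, and which channels are emphasized depends on the input.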
