IAtraj: Multi-Modal Trajectory Prediction Through Contextual Information Spatio-Temporal Interaction and Awareness

Xiaoliang Wang; Lian Zhou; Kuan Ching Li; Shiqi Zheng; Huijing Fan

doi:10.9781/ijimai.2024.09.001

Authors

Xiaoliang Wang, Hunan University of Science and Technology
Lian Zhou, Hunan University of Science and Technology
Kuan Ching Li, Providence University
Shiqi Zheng, Hunan University of Science and Technology
Huijing Fan, Hunan University of Science and Technology

Keywords:

Attention Mechanisms, Contextual Information, Multi-Modal, Spatio-Temporal Interaction and Awareness, Trajectory Prediction

Supporting Agencies

This research was supported by the Scientific Research Fund of National Natural Science Foundation of China (Grant No. 62372168), the Hunan Provincial Natural Science Foundation of China (Grant No. 2023JJ30266), the Research Project on Teaching Reform in Hunan Province (No. HNJG-2022-0791) and Hunan University of Science and Technology (No. 2022-44-8), and the National Social Science Funds of China (No. 19BZX044).

Abstract

Accurately and feasibly predicting the future trajectories of autonomous vehicles is a critically important task. However, the task is challenging because driving intentions vary and social interactions are complex: a model must both understand the vehicle’s own driving behavior and capture interaction information from the surrounding environment. A substantial amount of research has focused on integrating this interaction information, mainly through raster images or High-Definition maps (HD maps). However, the need to update environmental maps in real time and the high computational cost of processing interaction information with technologies such as vision have become limiting factors. Additionally, ineffective simulation and modeling of real driving scenarios, coupled with an inadequate understanding of contextual environmental information, result in lower prediction accuracy. To overcome these challenges, we propose IAtraj, a multi-modal trajectory prediction model based on sequence modeling. Incorporating multiple attention mechanisms, IAtraj focuses on three critical elements of real traffic scenarios: the target agent’s historical trajectory, effective interactions with neighboring vehicles, and lane supervision and retention strategies. To better model these elements, we design modules for Temporal Interaction (TI), Spatial Interaction (SI), and Lane Awareness (LA). In extensive experiments on the publicly available nuScenes dataset, IAtraj exhibits outstanding performance, successfully addressing the challenges of temporal dependencies in trajectory sequences and the representation of scene changes. Finally, comprehensive ablation experiments validate the effectiveness of each significant module, reinforcing the reliability and robustness of IAtraj in complex traffic scenarios.
