Fig 2: GoalFlow consists of three modules. The Perception Module integrates scene information into a BEV feature \( F_{\text{bev}} \); the Goal Point Construction Module selects the optimal goal point from the Goal Point Vocabulary \( \mathbb{V} \) as guidance; and the Trajectory Planning Module generates trajectories by denoising samples from a Gaussian distribution toward the target distribution. Finally, the Trajectory Scorer selects the optimal trajectory from the candidates.
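The denoising step in the Trajectory Planning Module can be illustrated with a minimal flow-matching sampler. This is a sketch under stated assumptions, not the GoalFlow implementation: `toy_velocity` is a hypothetical stand-in for the learned, goal-conditioned velocity network, the target is a single known trajectory, and integration uses plain Euler steps.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity field (assumption).

    For a straight-line path toward a single known target, the exact
    velocity at time t is (target - x) / (1 - t).
    """
    return [(tg - xi) / (1.0 - t) for xi, tg in zip(x, target)]


def sample_trajectory(target, num_steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps."""
    rng = random.Random(seed)
    # Start from a sample of the Gaussian (noise) distribution.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t never reaches 1, so (1 - t) stays positive
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Flattened (x, y) waypoints of a toy target trajectory (made-up values).
target = [1.0, 0.0, 2.0, 0.5, 3.0, 1.5]
result = sample_trajectory(target)
```

In this toy setting the sampler ends at the target regardless of the initial noise. In a real model the velocity field would be a network conditioned on \( F_{\text{bev}} \) and the selected goal point, and different noise samples would yield different candidate trajectories.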
Fig 3: (a) shows the detailed structure of the Goal Point Construction Module; (b) presents the score distributions of \( \{ \hat{\delta}^{dis}_i \}^N \), \( \{ \hat{\delta}^{dac}_i \}^N \), and \( \{ \hat{\delta}^{final}_i \}^N \), where points with higher scores are shown in warmer colors.
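A small sketch can show how a final score might pick one goal point from the vocabulary. The caption does not state how \( \hat{\delta}^{final}_i \) is formed, so the rule below is an assumption for illustration only: a weighted sum of the logs of the two scores. The function name, the weights `w_dis` and `w_dac`, and all numeric values are hypothetical.

```python
import math


def select_goal_point(vocabulary, dis_scores, dac_scores,
                      w_dis=1.0, w_dac=1.0, eps=1e-6):
    """Return the vocabulary point with the highest combined score.

    Assumed combination rule (not confirmed by the source):
    final_i = w_dis * log(dis_i) + w_dac * log(dac_i)
    """
    final_scores = [
        w_dis * math.log(d + eps) + w_dac * math.log(a + eps)
        for d, a in zip(dis_scores, dac_scores)
    ]
    best = max(range(len(vocabulary)), key=final_scores.__getitem__)
    return vocabulary[best], final_scores


# Toy vocabulary of candidate goal points (x, y); values are made up.
vocabulary = [(10.0, 0.0), (12.0, 2.0), (8.0, -3.0)]
dis_scores = [0.2, 0.7, 0.1]  # closeness to the intended endpoint
dac_scores = [0.9, 0.1, 0.8]  # likelihood of lying in the drivable area
goal, final_scores = select_goal_point(vocabulary, dis_scores, dac_scores)
```

In this toy input the second point has the highest distance score but a low drivable-area score, so the combined score favors the first point instead.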
Fig 4: \( \times \) marks a trajectory that results in a collision or leaves the drivable area, while ✔ marks a safe trajectory. The orange points are generated by the Goal Constructor, while the blue and yellow points are samples from the vocabulary. The results show that GoalFlow generates higher-quality trajectories than the other two methods.
@article{xing2025goalflow,
  title={GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving},
  author={Xing, Zebin and Zhang, Xingyu and Hu, Yang and Jiang, Bo and He, Tong and Zhang, Qian and Long, Xiaoxiao and Yin, Wei},
  journal={arXiv preprint arXiv:2503.05689},
  year={2025}
}