Abstract: In resource-constrained vehicle systems, establishing consistency between multi-view scenes and driver gaze remains challenging. Prior methods mainly focus on cross-source data fusion, ...