metadata
license: apache-2.0
pipeline_tag: image-to-video
tags:
- autonomous driving
- video generation
- world model
Model Card for Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Brief Introduction
Vista is a generalizable driving world model with the following capabilities:
- High-Fidelity Future Prediction: Predict high-fidelity futures in various scenarios.
- Coherent Long-Horizon Rollout: Extend its predictions to continuous and long horizons.
- Versatile Action Controllability: Execute multi-modal actions (steering angles, speeds, commands, trajectories, goal points).
- Generalizable Reward Function: Provide rewards for different actions without access to ground-truth actions.
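To make the long-horizon rollout idea concrete, here is a minimal sketch of autoregressive chaining, where each predicted clip supplies the conditioning frames for the next one. It is not Vista's actual interface: `predict_clip`, `rollout`, the clip length, and the number of conditioning frames are all hypothetical, and the stand-in predictor simply counts upward so the script runs without the model.

```python
from typing import Callable, List, Sequence

Frame = int  # stand-in for an image; a real frame would be an array or tensor


def predict_clip(context: Sequence[Frame], clip_len: int) -> List[Frame]:
    """Hypothetical stand-in for a world model: continue from the last context frame."""
    last = context[-1]
    return [last + i + 1 for i in range(clip_len)]


def rollout(
    first_frame: Frame,
    predict: Callable[[Sequence[Frame], int], List[Frame]],
    num_clips: int,
    clip_len: int = 4,      # assumed value, for illustration only
    num_context: int = 2,   # assumed value, for illustration only
) -> List[Frame]:
    """Chain short predictions into one long sequence.

    After each clip, the last `num_context` frames become the
    conditioning input for the next prediction.
    """
    frames: List[Frame] = [first_frame]
    context: List[Frame] = [first_frame]
    for _ in range(num_clips):
        clip = predict(context, clip_len)
        frames.extend(clip)
        context = frames[-num_context:]
    return frames


if __name__ == "__main__":
    video = rollout(first_frame=0, predict=predict_clip, num_clips=3)
    print(len(video))  # prints 13: 1 initial frame + 3 clips of 4 frames
```

With the real model, `predict` would wrap the sampling code from the repository, and the frames would be images rather than integers.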
Related Links
For more technical details and discussions, please refer to:
- Paper: https://arxiv.org/abs/2405.17398
- Code: https://github.com/OpenDriveLab/Vista
- Demo: https://vista-demo.github.io
How to Use
For setup and usage instructions, see the code repository: https://github.com/OpenDriveLab/Vista
Citation
@article{gao2024vista,
title={Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability},
author={Shenyuan Gao and Jiazhi Yang and Li Chen and Kashyap Chitta and Yihang Qiu and Andreas Geiger and Jun Zhang and Hongyang Li},
journal={arXiv preprint arXiv:2405.17398},
year={2024}
}
@inproceedings{yang2024genad,
title={Generalized Predictive Model for Autonomous Driving},
author={Jiazhi Yang and Shenyuan Gao and Yihang Qiu and Li Chen and Tianyu Li and Bo Dai and Kashyap Chitta and Penghao Wu and Jia Zeng and Ping Luo and Jun Zhang and Andreas Geiger and Yu Qiao and Hongyang Li},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2024}
}
Contact
If you have any questions or comments, feel free to email sygao@connect.ust.hk.