A large-scale video-text dataset of high-resolution videos annotated with dense and detailed captions.