Cityscapes Dataset
@inproceedings{Cordts2016Cityscapes,
title={The Cityscapes Dataset for Semantic Urban Scene Understanding},
author={Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},
booktitle={Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2016}
}
Common settings
- All baselines were trained on 8 GPUs with a total batch size of 8 (1 image per GPU), using the linear scaling rule to scale the learning rate.
- All models were trained on cityscapes_train and tested on cityscapes_val.
- The 1x training schedule means 64 epochs, which corresponds to slightly fewer than the 24k iterations of the original schedule in the Mask R-CNN paper.
- Models are initialized from COCO pre-trained weights.
- A conversion script is provided to convert Cityscapes into COCO format. Please refer to install.md for details.
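
The schedule arithmetic in the settings above can be checked with a small sketch. It assumes the training split has 2975 images (the finely annotated Cityscapes train set; the count may differ if images without instances are filtered out) and uses a hypothetical reference learning rate of 0.02 at batch size 16 to illustrate the linear scaling rule. Neither number comes from this page, and the function names are illustrative only.

```python
import math


def iterations_for_schedule(num_images, batch_size, epochs):
    """Total optimizer steps when each epoch visits every image once."""
    iters_per_epoch = math.ceil(num_images / batch_size)
    return iters_per_epoch * epochs


def scale_learning_rate(reference_lr, reference_batch_size, batch_size):
    """Linear scaling rule: the learning rate scales in proportion to batch size."""
    return reference_lr * batch_size / reference_batch_size


# Assumption: 2975 training images; batch size 8 and 64 epochs are from the settings above.
total_iters = iterations_for_schedule(num_images=2975, batch_size=8, epochs=64)

# Assumption: hypothetical reference setting of lr 0.02 at batch size 16.
lr = scale_learning_rate(reference_lr=0.02, reference_batch_size=16, batch_size=8)

print(total_iters, lr)
```

Under these assumptions the schedule comes to 23808 iterations, which is consistent with "slightly fewer than 24k", and halving the batch size halves the learning rate.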
CityscapesDataset implements three evaluation methods: bbox and segm compute the standard COCO box and mask AP, while cityscapes runs the official Cityscapes evaluation, which may give slightly higher scores than the COCO metrics.
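
As a rough illustration of how the metric names might be selected, the sketch below builds a config-style dictionary and validates the requested names against the three described above. The `evaluation` dict layout and the `check_metrics` helper are assumptions made for this example, not a confirmed interface of the codebase; check the config files linked in the tables for the exact keys.

```python
# The three metric names described in the text above.
SUPPORTED_METRICS = ("bbox", "segm", "cityscapes")


def check_metrics(metrics):
    """Return the metrics as a list, rejecting names the text does not describe."""
    unknown = [m for m in metrics if m not in SUPPORTED_METRICS]
    if unknown:
        raise ValueError(f"Unsupported metrics: {unknown}")
    return list(metrics)


# Hypothetical config fragment: request COCO-style box and mask AP.
evaluation = dict(metric=check_metrics(["bbox", "segm"]))

# Hypothetical alternative: request the official Cityscapes evaluation instead.
evaluation_official = dict(metric=check_metrics(["cityscapes"]))
```

Because the official evaluation may score slightly higher than the COCO metrics, it helps to state which metric was used when comparing numbers.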
Faster R-CNN
| Backbone | Style | Lr schd | Scale | Mem (GB) | Inf time (fps) | box AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-FPN | pytorch | 1x | 800-1024 | 5.2 | - | 40.3 | config | model / log |
Mask R-CNN
| Backbone | Style | Lr schd | Scale | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| R-50-FPN | pytorch | 1x | 800-1024 | 5.3 | - | 40.9 | 36.4 | config | model / log |