# Use Builtin Datasets
A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
[Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
and how to add new datasets to them.
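For a quick sense of what these two APIs return, here is a minimal sketch; it assumes the COCO files described below are already in place, and uses `coco_2017_train`, one of the names detectron2 registers for its builtin datasets:

```python
from detectron2.data import DatasetCatalog, MetadataCatalog

# Metadata (class names, etc.) is available without touching the files.
metadata = MetadataCatalog.get("coco_2017_train")
print(metadata.thing_classes[:5])  # first few COCO class names

# Loading the samples themselves requires the dataset files described below.
dataset_dicts = DatasetCatalog.get("coco_2017_train")  # a list of dicts
print(len(dataset_dicts), "images")
print(dataset_dicts[0]["file_name"])
```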
Detectron2 has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable
`DETECTRON2_DATASETS`.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
```
$DETECTRON2_DATASETS/
  coco/
  lvis/
  cityscapes/
  VOC20{07,12}/
```
You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
If left unset, the default is `./datasets` relative to your current working directory.
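The variable is read when detectron2 registers the builtin datasets, which happens when `detectron2.data` is first imported; so if you set it from Python instead of the shell, do it before that import. A minimal sketch of that assumption:

```python
import os

# Must be set before the first detectron2.data import, since the
# builtin registration reads DETECTRON2_DATASETS at import time.
os.environ["DETECTRON2_DATASETS"] = "/path/to/datasets"

from detectron2.data import DatasetCatalog  # builtin datasets register here
```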
The [model zoo](https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md)
contains configs and models that use these builtin datasets.
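Those configs refer to the builtin datasets by their registered names, which you can inspect through the `detectron2.model_zoo` API; a small sketch:

```python
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
print(cfg.DATASETS.TRAIN)  # ('coco_2017_train',) -- a builtin dataset name
```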
## Expected dataset structure for [COCO instance/keypoint detection](https://cocodataset.org/#download):
```
coco/
  annotations/
    instances_{train,val}2017.json
    person_keypoints_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
```
You can use the 2014 version of the dataset as well.
Some of the builtin tests (`dev/run_*_tests.sh`) use a tiny version of the COCO dataset,
which you can download with `./datasets/prepare_for_tests.sh`.
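Once the files are in place, a quick sanity check of the layout can save a failed run later. A minimal sketch, assuming the default `./datasets` root:

```python
import json
from pathlib import Path

root = Path("datasets/coco")
with open(root / "annotations" / "instances_val2017.json") as f:
    data = json.load(f)
print(len(data["images"]), "images,", len(data["annotations"]), "annotations")

# Every image listed in the json should exist under val2017/.
missing = [im["file_name"] for im in data["images"]
           if not (root / "val2017" / im["file_name"]).exists()]
print(len(missing), "image files missing")
```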
## Expected dataset structure for PanopticFPN:
Extract panoptic annotations from [COCO website](https://cocodataset.org/#download)
into the following structure:
```
coco/
  annotations/
    panoptic_{train,val}2017.json
  panoptic_{train,val}2017/  # png annotations
  panoptic_stuff_{train,val}2017/  # generated by the script mentioned below
```
Install panopticapi with:
```
pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `python datasets/prepare_panoptic_fpn.py` to extract semantic annotations from panoptic annotations.
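For context on what the script consumes: the panoptic json lists both thing and stuff categories, and the semantic annotations are derived from the stuff part. A small sketch of peeking at it, assuming the layout above under the default root:

```python
import json

with open("datasets/coco/annotations/panoptic_val2017.json") as f:
    panoptic = json.load(f)

# COCO panoptic categories carry an "isthing" flag; the stuff
# (isthing == 0) categories feed the generated semantic annotations.
stuff = [c["name"] for c in panoptic["categories"] if not c["isthing"]]
print(len(stuff), "stuff categories, e.g.", stuff[:5])
```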
## Expected dataset structure for [LVIS instance segmentation](https://www.lvisdataset.org/dataset):
```
coco/
  {train,val,test}2017/
lvis/
  lvis_v0.5_{train,val}.json
  lvis_v0.5_image_info_test.json
  lvis_v1_{train,val}.json
  lvis_v1_image_info_test{,_challenge}.json
```
Install lvis-api with:
```
pip install git+https://github.com/lvis-dataset/lvis-api.git
```
To evaluate models trained on the COCO dataset using LVIS annotations,
run `python datasets/prepare_cocofied_lvis.py` to prepare "cocofied" LVIS annotations.
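After installing lvis-api and placing the json files, you can load the annotations directly to verify the setup; a minimal sketch:

```python
from lvis import LVIS

# Loads and indexes the annotation file (can take a moment for train).
lvis = LVIS("datasets/lvis/lvis_v1_val.json")
print(len(lvis.get_img_ids()), "images,", len(lvis.get_cat_ids()), "categories")
```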
## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):
```
cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
    # below are generated Cityscapes panoptic annotations
    cityscapes_panoptic_train.json
    cityscapes_panoptic_train/
    cityscapes_panoptic_val.json
    cityscapes_panoptic_val/
    cityscapes_panoptic_test.json
    cityscapes_panoptic_test/
  leftImg8bit/
    train/
    val/
    test/
```
Install cityscapesScripts with:
```
pip install git+https://github.com/mcordts/cityscapesScripts.git
```
Note: to create labelTrainIds.png, first prepare the above structure, then run the cityscapesScripts preparation script with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```
These files are not needed for instance segmentation.
Note: to generate the Cityscapes panoptic dataset, run the cityscapesScripts preparation script with:
```
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
```
These files are not needed for semantic or instance segmentation.
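A quick way to confirm both preparation scripts ran is to count what they generated; a sketch, assuming the default `./datasets` root:

```python
from pathlib import Path

gt = Path("datasets/cityscapes/gtFine")

# createTrainIdLabelImgs.py writes one *_labelTrainIds.png per annotation.
n = len(list(gt.glob("train/*/*_labelTrainIds.png")))
print(n, "labelTrainIds files under gtFine/train/")

# createPanopticImgs.py writes the panoptic jsons and png directories.
print("panoptic val json exists:", (gt / "cityscapes_panoptic_val.json").exists())
```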
## Expected dataset structure for [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/index.html):
```
VOC20{07,12}/
  Annotations/
  ImageSets/
    Main/
      trainval.txt
      test.txt
      # train.txt or val.txt, if you use these splits
  JPEGImages/
```
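With this layout in place, detectron2 registers the VOC splits under names such as `voc_2007_trainval` and `voc_2007_test`; a minimal sketch of loading one:

```python
from detectron2.data import DatasetCatalog, MetadataCatalog

dicts = DatasetCatalog.get("voc_2007_trainval")
print(len(dicts), "images")
print(MetadataCatalog.get("voc_2007_trainval").thing_classes)  # the 20 VOC classes
```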
## Expected dataset structure for [ADE20k Scene Parsing](http://sceneparsing.csail.mit.edu/):
```
ADEChallengeData2016/
  annotations/
  annotations_detectron2/
  images/
  objectInfo150.txt
```
The directory `annotations_detectron2` is generated by running `python datasets/prepare_ade20k_sem_seg.py`.
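With `annotations_detectron2` generated, the builtin semantic segmentation splits, registered as `ade20k_sem_seg_{train,val}`, become loadable; a minimal sketch:

```python
from detectron2.data import DatasetCatalog

dicts = DatasetCatalog.get("ade20k_sem_seg_val")
print(len(dicts), "images")
print(dicts[0]["sem_seg_file_name"])  # points into annotations_detectron2/
```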