Spaces:

tomofi
/

MMOCR

Runtime error

App Files Files Community

MMOCR / docs /en /datasets /det.md

tomofi

Add application file

2366e36 over 2 years ago

preview code

raw

history blame

No virus

15.6 kB


	# Text Detection

	## Overview

	The structure of the text detection dataset directory is organized as follows.

	```text
	├── ctw1500
	│ ├── annotations
	│ ├── imgs
	│ ├── instances_test.json
	│ └── instances_training.json
	├── icdar2015
	│ ├── imgs
	│ ├── instances_test.json
	│ └── instances_training.json
	├── icdar2017
	│ ├── imgs
	│ ├── instances_training.json
	│ └── instances_val.json
	├── synthtext
	│ ├── imgs
	│ └── instances_training.lmdb
	│ ├── data.mdb
	│ └── lock.mdb
	├── textocr
	│ ├── train
	│ ├── instances_training.json
	│ └── instances_val.json
	├── totaltext
	│ ├── imgs
	│ ├── instances_test.json
	│ └── instances_training.json
	├── CurvedSynText150k
	│ ├── syntext_word_eng
	│ ├── emcs_imgs
	│ └── instances_training.json
	\|── funsd
	\| ├── annotations
	│ ├── imgs
	│ ├── instances_test.json
	│ └── instances_training.json
	```

	\| Dataset \| Images \| \| Annotation Files \| \| \|
	\| :---------------: \| :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: \| :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: \| :------------------------------------------------------------------------------------------: \| :--------------------------------------------------------------------------------------------: \| :---: \|
	\| \| \| training \| validation \| testing \| \|
	\| CTW1500 \| [homepage](https://github.com/Yuliang-Liu/Curve-Text-Detector) \| - \| - \| - \|
	\| ICDAR2015 \| [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads) \| [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) \| - \| [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) \|
	\| ICDAR2017 \| [homepage](https://rrc.cvc.uab.es/?ch=8&com=downloads) \| [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_training.json) \| [instances_val.json](https://download.openmmlab.com/mmocr/data/icdar2017/instances_val.json) \| - \| \| \|
	\| Synthtext \| [homepage](https://www.robots.ox.ac.uk/~vgg/data/scenetext/) \| instances_training.lmdb ([data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb), [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb)) \| - \| - \|
	\| TextOCR \| [homepage](https://textvqa.org/textocr/dataset) \| - \| - \| - \|
	\| Totaltext \| [homepage](https://github.com/cs-chan/Total-Text-Dataset) \| - \| - \| - \|
	\| CurvedSynText150k \| [homepage](https://github.com/aim-uofa/AdelaiDet/blob/master/datasets/README.md) \\| [Part1](https://drive.google.com/file/d/1OSJ-zId2h3t_-I7g_wUkrK-VqQy153Kj/view?usp=sharing) \\| [Part2](https://drive.google.com/file/d/1EzkcOlIgEp5wmEubvHb7-J5EImHExYgY/view?usp=sharing) \| [instances_training.json](https://download.openmmlab.com/mmocr/data/curvedsyntext/instances_training.json) \| - \| - \|
	\| FUNSD \| [homepage](https://guillaumejaume.github.io/FUNSD/) \| - \| - \| - \|


	## Important Note

	:::{note}
	For users who want to train models on CTW1500, ICDAR 2015/2017, and Totaltext dataset, there might be some images containing orientation info in EXIF data. The default OpenCV
	backend used in MMCV would read them and apply the rotation on the images. However, their gold annotations are made on the raw pixels, and such
	inconsistency results in false examples in the training set. Therefore, users should use `dict(type='LoadImageFromFile', color_type='color_ignore_orientation')` in pipelines to change MMCV's default loading behaviour. (see [DBNet's pipeline config](https://github.com/open-mmlab/mmocr/blob/main/configs/_base_/det_pipelines/dbnet_pipeline.py) for example)
	:::

	## Preparation Steps
	### ICDAR 2015
	- Step0: Read [Important Note](#important-note)
	- Step1: Download `ch4_training_images.zip`, `ch4_test_images.zip`, `ch4_training_localization_transcription_gt.zip`, `Challenge4_Test_Task1_GT.zip` from [homepage](https://rrc.cvc.uab.es/?ch=4&com=downloads)
	- Step2:
	```bash
	mkdir icdar2015 && cd icdar2015
	mkdir imgs && mkdir annotations
	# For images,
	mv ch4_training_images imgs/training
	mv ch4_test_images imgs/test
	# For annotations,
	mv ch4_training_localization_transcription_gt annotations/training
	mv Challenge4_Test_Task1_GT annotations/test
	```
	- Step3: Download [instances_training.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_training.json) and [instances_test.json](https://download.openmmlab.com/mmocr/data/icdar2015/instances_test.json) and move them to `icdar2015`
	- Or, generate `instances_training.json` and `instances_test.json` with following command:
	```bash
	python tools/data/textdet/icdar_converter.py /path/to/icdar2015 -o /path/to/icdar2015 -d icdar2015 --split-list training test
	```

	### ICDAR 2017
	- Follow similar steps as [ICDAR 2015](#icdar-2015).

	### CTW1500
	- Step0: Read [Important Note](#important-note)
	- Step1: Download `train_images.zip`, `test_images.zip`, `train_labels.zip`, `test_labels.zip` from [github](https://github.com/Yuliang-Liu/Curve-Text-Detector)
	```bash
	mkdir ctw1500 && cd ctw1500
	mkdir imgs && mkdir annotations

	# For annotations
	cd annotations
	wget -O train_labels.zip https://universityofadelaide.box.com/shared/static/jikuazluzyj4lq6umzei7m2ppmt3afyw.zip
	wget -O test_labels.zip https://cloudstor.aarnet.edu.au/plus/s/uoeFl0pCN9BOCN5/download
	unzip train_labels.zip && mv ctw1500_train_labels training
	unzip test_labels.zip -d test
	cd ..
	# For images
	cd imgs
	wget -O train_images.zip https://universityofadelaide.box.com/shared/static/py5uwlfyyytbb2pxzq9czvu6fuqbjdh8.zip
	wget -O test_images.zip https://universityofadelaide.box.com/shared/static/t4w48ofnqkdw7jyc4t11nsukoeqk9c3d.zip
	unzip train_images.zip && mv train_images training
	unzip test_images.zip && mv test_images test
	```
	- Step2: Generate `instances_training.json` and `instances_test.json` with following command:

	```bash
	python tools/data/textdet/ctw1500_converter.py /path/to/ctw1500 -o /path/to/ctw1500 --split-list training test
	```

	### SynthText

	- Download [data.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/data.mdb) and [lock.mdb](https://download.openmmlab.com/mmocr/data/synthtext/instances_training.lmdb/lock.mdb) to `synthtext/instances_training.lmdb/`.

	### TextOCR
	- Step1: Download [train_val_images.zip](https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip), [TextOCR_0.1_train.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json) and [TextOCR_0.1_val.json](https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json) to `textocr/`.
	```bash
	mkdir textocr && cd textocr

	# Download TextOCR dataset
	wget https://dl.fbaipublicfiles.com/textvqa/images/train_val_images.zip
	wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_train.json
	wget https://dl.fbaipublicfiles.com/textvqa/data/textocr/TextOCR_0.1_val.json

	# For images
	unzip -q train_val_images.zip
	mv train_images train
	```
	- Step2: Generate `instances_training.json` and `instances_val.json` with the following command:
	```bash
	python tools/data/textdet/textocr_converter.py /path/to/textocr
	```
	### Totaltext
	- Step0: Read [Important Note](#important-note)
	- Step1: Download `totaltext.zip` from [github dataset](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Dataset) and `groundtruth_text.zip` from [github Groundtruth](https://github.com/cs-chan/Total-Text-Dataset/tree/master/Groundtruth/Text) (Our totaltext_converter.py supports groundtruth with both .mat and .txt format).
	```bash
	mkdir totaltext && cd totaltext
	mkdir imgs && mkdir annotations

	# For images
	# in ./totaltext
	unzip totaltext.zip
	mv Images/Train imgs/training
	mv Images/Test imgs/test

	# For annotations
	unzip groundtruth_text.zip
	cd Groundtruth
	mv Polygon/Train ../annotations/training
	mv Polygon/Test ../annotations/test

	```
	- Step2: Generate `instances_training.json` and `instances_test.json` with the following command:
	```bash
	python tools/data/textdet/totaltext_converter.py /path/to/totaltext -o /path/to/totaltext --split-list training test
	```

	### CurvedSynText150k

	- Step1: Download [syntext1.zip](https://drive.google.com/file/d/1OSJ-zId2h3t_-I7g_wUkrK-VqQy153Kj/view?usp=sharing) and [syntext2.zip](https://drive.google.com/file/d/1EzkcOlIgEp5wmEubvHb7-J5EImHExYgY/view?usp=sharing) to `CurvedSynText150k/`.
	- Step2:

	```bash
	unzip -q syntext1.zip
	mv train.json train1.json
	unzip images.zip
	rm images.zip

	unzip -q syntext2.zip
	mv train.json train2.json
	unzip images.zip
	rm images.zip
	```

	- Step3: Download [instances_training.json](https://download.openmmlab.com/mmocr/data/curvedsyntext/instances_training.json) to `CurvedSynText150k/`
	- Or, generate `instances_training.json` with following command:

	```bash
	python tools/data/common/curvedsyntext_converter.py PATH/TO/CurvedSynText150k --nproc 4
	```

	### FUNSD

	- Step1: Download [dataset.zip](https://guillaumejaume.github.io/FUNSD/dataset.zip) to `funsd/`.

	```bash
	mkdir funsd && cd funsd

	# Download FUNSD dataset
	wget https://guillaumejaume.github.io/FUNSD/dataset.zip
	unzip -q dataset.zip

	# For images
	mv dataset/training_data/images imgs && mv dataset/testing_data/images/* imgs/

	# For annotations
	mkdir annotations
	mv dataset/training_data/annotations annotations/training && mv dataset/testing_data/annotations annotations/test

	rm dataset.zip && rm -rf dataset
	```

	- Step2: Generate `instances_training.json` and `instances_test.json` with following command:

	```bash
	python tools/data/textdet/funsd_converter.py PATH/TO/funsd --nproc 4
	```