doc: add broken detection
README.md
CHANGED
@@ -44,6 +44,18 @@ The generated dataset will be saved in the `dataset/font_img` directory.
Note that `batch_generate_script_cmd_32.bat` and `batch_generate_script_cmd_64.bat` are batch scripts for Windows that can be used to generate the dataset in parallel with 32 partitions and 64 partitions.

+### Final Check
+
+Because the task might be terminated unexpectedly, or deliberately by the user, the script has a caching mechanism that avoids regenerating images that already exist.
+
+However, the script might not be able to detect corruption in the cache during the task (for example, a file left incomplete because the task was terminated while writing it). We therefore also provide a script that checks the generated dataset and removes the corrupted images and labels.
+
+```bash
+python font_ds_detect_broken.py
+```
+
+After running the script, you might want to rerun the generation script to regenerate the corrupted files that were removed.
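
For readers who want a sense of what such a check can involve, here is a minimal sketch. It is not the implementation of `font_ds_detect_broken.py`, whose internals this change does not show. It assumes the images are PNG or JPEG files and that each image has a label file with the same stem. The `.bin` label suffix, the function names, and the `dry_run` flag are all hypothetical. The check only looks at the leading and trailing bytes, which catches files cut off mid-write but not every kind of corruption. It may also flag valid JPEG files that carry extra bytes after the end marker.

```python
from pathlib import Path

# Byte patterns that complete files of these formats start and end with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
PNG_TRAILER = bytes.fromhex("0000000049454e44ae426082")  # empty IEND chunk plus CRC
JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def looks_truncated(path: Path) -> bool:
    """Return True if the file lacks its format's expected first or last bytes."""
    data = path.read_bytes()
    if path.suffix.lower() == ".png":
        return not (data.startswith(PNG_SIGNATURE) and data.endswith(PNG_TRAILER))
    return not (data.startswith(JPEG_SOI) and data.endswith(JPEG_EOI))


def remove_broken_pairs(root, label_suffix=".bin", dry_run=False):
    """Delete each image whose bytes look truncated or whose label is missing
    or empty, together with that label. Returns the affected image paths.

    label_suffix is a hypothetical naming convention, not taken from the repo.
    """
    removed = []
    # sorted() materializes the listing before any file is deleted.
    for img in sorted(Path(root).rglob("*")):
        if img.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label = img.with_suffix(label_suffix)
        label_ok = label.exists() and label.stat().st_size > 0
        if looks_truncated(img) or not label_ok:
            removed.append(img)
            if not dry_run:
                img.unlink()
                label.unlink(missing_ok=True)  # requires Python 3.8+
    return removed
```

Calling `remove_broken_pairs("dataset/font_img", dry_run=True)` first would list the affected files without deleting anything. Because both files of a pair are removed, a rerun of the generation script should treat the sample as missing from the cache.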
+
### (Optional) Linux Cluster Generation Walkthrough

If you would like to run the generation script on Linux clusters, we also provide the environment setup script `linux_venv_setup.sh`.