+| Model | Config | Metric | PyTorch | ONNX Runtime |
+| :----------: | :------------------------------------------------------: | :-----: | :-----: | :----------: |
+| FCOS | `configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py` | Box AP | 36.6 | 36.5 |
+| FSAF | `configs/fsaf/fsaf_r50_fpn_1x_coco.py` | Box AP | 36.0 | 36.0 |
+| RetinaNet | `configs/retinanet/retinanet_r50_fpn_1x_coco.py` | Box AP | 36.5 | 36.4 |
+| SSD | `configs/ssd/ssd300_coco.py` | Box AP | 25.6 | 25.6 |
+| YOLOv3 | `configs/yolo/yolov3_d53_mstrain-608_273e_coco.py` | Box AP | 33.5 | 33.5 |
+| Faster R-CNN | `configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py` | Box AP | 37.4 | 37.4 |
+| Mask R-CNN | `configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py` | Box AP | 38.2 | 38.1 |
+| Mask R-CNN | `configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py` | Mask AP | 34.7 | 33.7 |
+- All ONNX models are evaluated with dynamic shape on the COCO dataset, and images are preprocessed according to the original config file.
+- The Mask AP of Mask R-CNN drops by 1 point (34.7 vs. 33.7) with ONNX Runtime. The main reason is that PyTorch interpolates the predicted masks directly to the original image size, whereas ONNX Runtime first interpolates them to the preprocessed input image of the model and then to the original image size.
+## List of supported models exportable to ONNX
+The table below lists the models that are guaranteed to be exportable to ONNX and runnable in ONNX Runtime.
+| Model | Config | Dynamic Shape | Batch Inference | Note |
+| :----------: | :------------------------------------------------------: | :-----------: | :-------------: | :---: |
+| FCOS | `configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py` | Y | Y | |
+| FSAF | `configs/fsaf/fsaf_r50_fpn_1x_coco.py` | Y | Y | |
+| RetinaNet | `configs/retinanet/retinanet_r50_fpn_1x_coco.py` | Y | Y | |
+| SSD | `configs/ssd/ssd300_coco.py` | Y | Y | |
+| YOLOv3 | `configs/yolo/yolov3_d53_mstrain-608_273e_coco.py` | Y | Y | |
+| Faster R-CNN | `configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py` | Y | Y | |
+| Mask R-CNN | `configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py` | Y | Y | |
+- *All models above are tested with Pytorch==1.6.0 and onnxruntime==1.5.1*
+- If the deployed backend platform is TensorRT, please add environment variables before running the file:
+ ```bash
+ ```
+- If you want to use the `--dynamic-export` parameter in the TensorRT backend to export ONNX, please remove the `--simplify` parameter, and vice versa.
+## The Parameters of Non-Maximum Suppression in ONNX Export
+When exporting the ONNX model, we set some parameters of the NMS op to control the number of output bounding boxes. The parameters of the NMS op in the supported models are described below. You can set them through `--cfg-options`.
+- `nms_pre`: The number of boxes before NMS. The default setting is `1000`.
+- `deploy_nms_pre`: The number of boxes before NMS when exporting to ONNX model. The default setting is `0`.
+- `max_per_img`: The number of boxes to be kept after NMS. The default setting is `100`.
+- `max_output_boxes_per_class`: Maximum number of output boxes per class of NMS. The default setting is `200`.
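To make the roles of these parameters more concrete, here is a minimal pure-Python sketch of a class-agnostic filtering pipeline. It is not the exporter's implementation: `iou` and `filter_boxes` are hypothetical helpers, the IoU threshold is an assumed extra parameter, and `deploy_nms_pre` and `max_output_boxes_per_class` are not modelled. It only illustrates that `nms_pre` limits the candidates entering NMS and `max_per_img` limits what survives it.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes, scores, nms_pre=1000, iou_thr=0.5, max_per_img=100):
    """Return indices of kept boxes (hypothetical helper, class-agnostic)."""
    # 1) Keep only the `nms_pre` highest scoring candidates.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    order = order[:nms_pre]
    # 2) Greedy NMS: drop a box if it overlaps an already kept box too much.
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    # 3) Cap the number of detections per image.
    return keep[:max_per_img]


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = filter_boxes(boxes, scores, nms_pre=3, iou_thr=0.5, max_per_img=2)
print(kept)
```

With `nms_pre=3` the lowest scoring box never reaches NMS, the second box is suppressed by the first, and `max_per_img=2` caps the final output.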
+## Reminders
+- When the input model contains a custom op such as `RoIAlign` and you want to verify the exported ONNX model, you may have to build `mmcv` with [ONNXRuntime](https://mmcv.readthedocs.io/en/latest/onnxruntime_op.html) from source.
+- `mmcv.onnx.simplify` feature is based on [onnx-simplifier](https://github.com/daquexian/onnx-simplifier). If you want to try it, please refer to [onnx in `mmcv`](https://mmcv.readthedocs.io/en/latest/onnx.html) and [onnxruntime op in `mmcv`](https://mmcv.readthedocs.io/en/latest/onnxruntime_op.html) for more information.
+- If you run into any problem with the models listed above, please create an issue and we will look into it soon. For models not included in the list, please try to debug and solve the problem yourself.
+- Because this feature is experimental and may change quickly, please always try the latest `mmcv` and `mmdetection`.
+## FAQs
+- None
+Apart from training/testing scripts, we provide lots of useful tools under the
+ `tools/` directory.
+## Log Analysis
+`tools/analysis_tools/analyze_logs.py` plots loss/mAP curves given a training
+ log file. Run `pip install seaborn` first to install the dependency.
+ ```shell
+python tools/analysis_tools/analyze_logs.py plot_curve [--keys ${KEYS}] [--title ${TITLE}] [--legend ${LEGEND}] [--backend ${BACKEND}] [--style ${STYLE}] [--out ${OUT_FILE}]
+ ```
+![loss curve image](../resources/loss_curve.png)
+- Plot the classification loss of some run.
+ ```shell
+ python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls --legend loss_cls
+ ```
+- Plot the classification and regression loss of some run, and save the figure to a pdf.
+ ```shell
+ python tools/analysis_tools/analyze_logs.py plot_curve log.json --keys loss_cls loss_bbox --out losses.pdf
+ ```
+- Compare the bbox mAP of two runs in the same figure.
+ ```shell
+ python tools/analysis_tools/analyze_logs.py plot_curve log1.json log2.json --keys bbox_mAP --legend run1 run2
+ ```
+- Compute the average training speed.
+ ```shell
+ python tools/analysis_tools/analyze_logs.py cal_train_time log.json [--include-outliers]
+ ```
+ The output is expected to be like the following.
+ ```text
+ -----Analyze train time of work_dirs/some_exp/20190611_192040.log.json-----
+ slowest epoch 11, average time is 1.2024
+ fastest epoch 1, average time is 1.1909
+ time std over epochs is 0.0028
+ average iter time: 1.1959 s/iter
+ ```
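As a rough illustration of how such a summary can be derived, the sketch below computes per-epoch iteration times from a JSON-lines log. It is not the tool's code. It assumes each log line is a JSON object with `mode`, `epoch` and `time` keys, and that the first logged record of each epoch is treated as an outlier unless `include_outliers` is set; the function names are hypothetical.

```python
import json
from collections import defaultdict
from statistics import mean, pstdev


def epoch_iter_times(lines, include_outliers=False):
    """Group logged training iteration times by epoch (assumed log schema)."""
    times = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get('mode') == 'train' and 'time' in record:
            times[record['epoch']].append(record['time'])
    if not include_outliers:
        # Assumption: the first record of an epoch includes warm-up overhead.
        return {e: t[1:] for e, t in times.items() if len(t) > 1}
    return dict(times)


def summarize(times):
    """Compute slowest/fastest epoch, std over epochs and mean iter time."""
    per_epoch = {e: mean(t) for e, t in times.items()}
    all_times = [x for t in times.values() for x in t]
    return {
        'slowest_epoch': max(per_epoch, key=per_epoch.get),
        'fastest_epoch': min(per_epoch, key=per_epoch.get),
        'std_over_epochs': pstdev(list(per_epoch.values())),
        'avg_iter_time': mean(all_times),
    }


sample_log = [
    '{"mode": "train", "epoch": 1, "iter": 50, "time": 2.0}',
    '{"mode": "train", "epoch": 1, "iter": 100, "time": 1.0}',
    '{"mode": "train", "epoch": 1, "iter": 150, "time": 1.2}',
    '{"mode": "val", "epoch": 1, "bbox_mAP": 0.1}',
    '{"mode": "train", "epoch": 2, "iter": 50, "time": 1.8}',
    '{"mode": "train", "epoch": 2, "iter": 100, "time": 1.4}',
    '{"mode": "train", "epoch": 2, "iter": 150, "time": 1.6}',
]
summary = summarize(epoch_iter_times(sample_log))
print(summary)
```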
+## Result Analysis
+`tools/analysis_tools/analyze_results.py` calculates single image mAP and saves or shows the topk images with the highest and lowest scores based on prediction results.
+```shell
+python tools/analysis_tools/analyze_results.py \
+      ${CONFIG} \
+      ${PREDICTION_PATH} \
+      ${SHOW_DIR} \
+      [--show] \
+      [--wait-time ${WAIT_TIME}] \
+      [--topk ${TOPK}] \
+      [--show-score-thr ${SHOW_SCORE_THR}] \
+      [--cfg-options ${CFG_OPTIONS}]
+```
+Description of all arguments:
+- `config` : The path of a model config file.
+- `prediction_path`: Output result file in pickle format from `tools/test.py`
+- `show_dir`: Directory where painted GT and detection images will be saved
+- `--show`: Determines whether to show painted images. If not specified, it will be set to `False`.
+- `--wait-time`: The display interval in seconds; `0` blocks until the window is closed.
+- `--topk`: The number of saved images that have the highest and lowest `topk` scores after sorting. If not specified, it will be set to `20`.
+- `--show-score-thr`: Show score threshold. If not specified, it will be set to `0`.
+- `--cfg-options`: If specified, the key-value pair optional cfg will be merged into config file
+Assume that you have a result file in pickle format from `tools/test.py` at the path `./result.pkl`.
+1. Test Faster R-CNN and visualize the results, saving images to the directory `results/`
+```shell
+python tools/analysis_tools/analyze_results.py \
+       configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+       result.pkl \
+       results \
+       --show
+```
+2. Test Faster R-CNN with topk set to 50, saving images to the directory `results/`
+```shell
+python tools/analysis_tools/analyze_results.py \
+       configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+       result.pkl \
+       results \
+       --topk 50
+```
+3. If you want to filter out low-score prediction results, you can specify the `--show-score-thr` parameter
+```shell
+python tools/analysis_tools/analyze_results.py \
+       configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
+       result.pkl \
+       results \
+       --show-score-thr 0.3
+```
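The top-k selection described in this section can be pictured with a small sketch. It assumes per-image mAP values are already available as a mapping from file name to score; `split_topk` is a hypothetical helper and not the function used by `analyze_results.py`.

```python
def split_topk(per_image_map, topk=20):
    """Return (best, worst) lists of (filename, mAP) pairs.

    `best` is sorted from highest to lowest score, `worst` from lowest to
    highest. Hypothetical helper for illustration only.
    """
    ranked = sorted(per_image_map.items(), key=lambda kv: kv[1])
    k = max(0, min(topk, len(ranked)))
    worst = ranked[:k]
    best = ranked[len(ranked) - k:][::-1]
    return best, worst


scores = {'a.jpg': 0.9, 'b.jpg': 0.2, 'c.jpg': 0.5, 'd.jpg': 0.7}
best, worst = split_topk(scores, topk=2)
print('best:', best)
print('worst:', worst)
```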
+## Visualization
+### Visualize Datasets
+`tools/misc/browse_dataset.py` helps the user browse a detection dataset (both
+ images and bounding box annotations) visually, or save the images to a
+ designated directory.
+```shell
+python tools/misc/browse_dataset.py ${CONFIG} [-h] [--skip-type ${SKIP_TYPE[SKIP_TYPE...]}] [--output-dir ${OUTPUT_DIR}] [--not-show] [--show-interval ${SHOW_INTERVAL}]
+```
+### Visualize Models
+First, convert the model to ONNX as described in the section "MMDetection model to ONNX (experimental)" below.
+Note that currently only RetinaNet is supported; support for other models
+ will come in later versions.
+The converted model could be visualized by tools like [Netron](https://github.com/lutzroeder/netron).
+### Visualize Predictions
+If you need a lightweight GUI for visualizing the detection results, you can refer to the [DetVisGUI project](https://github.com/Chien-Hung/DetVisGUI/tree/mmdetection).
+## Error Analysis
+`tools/analysis_tools/coco_error_analysis.py` analyzes COCO results per category and by
+ different criteria. It can also make a plot to provide useful information.
+```shell
+python tools/analysis_tools/coco_error_analysis.py ${RESULT} ${OUT_DIR} [-h] [--ann ${ANN}] [--types ${TYPES[TYPES...]}]
+```
+Assume that you have the [Mask R-CNN checkpoint file](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth) in the directory `checkpoint/`. For other checkpoints, please refer to our [model zoo](./model_zoo.md). You can use the following command to generate the bbox and segmentation result JSON files.
+```shell
+# out: results.bbox.json and results.segm.json
+python tools/test.py \
+       configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
+       checkpoint/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth \
+       --format-only \
+       --options "jsonfile_prefix=./results"
+```
+1. Get COCO bbox error results per category and save the analysis result images to the directory `results/`
+```shell
+python tools/analysis_tools/coco_error_analysis.py \
+       results.bbox.json \
+       results \
+       --ann=data/coco/annotations/instances_val2017.json
+```
+2. Get COCO segmentation error results per category and save the analysis result images to the directory `results/`
+```shell
+python tools/analysis_tools/coco_error_analysis.py \
+       results.segm.json \
+       results \
+       --ann=data/coco/annotations/instances_val2017.json \
+       --types='segm'
+```
+## Model Serving
+In order to serve an `MMDetection` model with [`TorchServe`](https://pytorch.org/serve/), you can follow the steps:
+### 1. Convert model from MMDetection to TorchServe
+```shell
+python tools/deployment/mmdet2torchserve.py ${CONFIG_FILE} ${CHECKPOINT_FILE} \
+--output-folder ${MODEL_STORE} \
+--model-name ${MODEL_NAME}
+```
+**Note**: `${MODEL_STORE}` needs to be an absolute path to a folder.
+### 2. Build `mmdet-serve` docker image
+```shell
+docker build -t mmdet-serve:latest docker/serve/
+```
+### 3. Run `mmdet-serve`
+Check the official docs for [running TorchServe with docker](https://github.com/pytorch/serve/blob/master/docker/README.md#running-torchserve-in-a-production-docker-environment).
+In order to run on GPU, you need to install [nvidia-docker](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). You can omit the `--gpus` argument in order to run on CPU.
+```shell
+docker run --rm \
+--cpus 8 \
+--gpus device=0 \
+-p8080:8080 -p8081:8081 -p8082:8082 \
+--mount type=bind,source=$MODEL_STORE,target=/home/model-server/model-store \
+mmdet-serve:latest
+```
+[Read the docs](https://github.com/pytorch/serve/blob/072f5d088cce9bb64b2a18af065886c9b01b317b/docs/rest_api.md) about the Inference (8080), Management (8081) and Metrics (8082) APIs.
+### 4. Test deployment
+```shell
+curl -O https://raw.githubusercontent.com/pytorch/serve/master/docs/images/3dogs.jpg
+curl${MODEL_NAME} -T 3dogs.jpg
+```
+You should obtain a response similar to:
+ {
+ "dog": [
+ 402.9117736816406,
+ 124.19664001464844,
+ 571.7910766601562,
+ 292.6463623046875
+ ],
+ "score": 0.9561963081359863
+ },
+ {
+ "dog": [
+ 293.90057373046875,
+ 196.2908477783203,
+ 417.4869079589844,
+ 286.2522277832031
+ ],
+ "score": 0.9179860353469849
+ },
+ {
+ "dog": [
+ 202.178466796875,
+ 86.3709487915039,
+ 311.9863586425781,
+ 276.28411865234375
+ ],
+ "score": 0.8933767080307007
+ }
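A client will usually want to turn such a response into plain records. The sketch below assumes the response body is a JSON array whose items each hold one class-name key mapping to four box coordinates plus a `score` key, as in the sample above; the coordinate values are shortened for brevity and `parse_detections` is a hypothetical helper.

```python
import json

# Shortened copy of the sample response, wrapped in a JSON array (assumption).
response_text = '''[
  {"dog": [402.9, 124.2, 571.8, 292.6], "score": 0.956},
  {"dog": [293.9, 196.3, 417.5, 286.3], "score": 0.918},
  {"dog": [202.2, 86.4, 312.0, 276.3], "score": 0.893}
]'''


def parse_detections(text, score_thr=0.0):
    """Return (label, box, score) tuples with score >= score_thr."""
    detections = []
    for item in json.loads(text):
        score = item['score']
        if score < score_thr:
            continue
        for key, value in item.items():
            if key != 'score':
                detections.append((key, value, score))
    return detections


confident = parse_detections(response_text, score_thr=0.9)
for label, box, score in confident:
    print(label, box, score)
```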
+## Model Complexity
+`tools/analysis_tools/get_flops.py` is a script adapted from [flops-counter.pytorch](https://github.com/sovrasov/flops-counter.pytorch) to compute the FLOPs and params of a given model.
+```shell
+python tools/analysis_tools/get_flops.py ${CONFIG_FILE} [--shape ${INPUT_SHAPE}]
+```
+You will get results like this:
+```text
+Input shape: (3, 1280, 800)
+Flops: 239.32 GFLOPs
+Params: 37.74 M
+```
+**Note**: This tool is still experimental and we do not guarantee that the
+ number is absolutely correct. You may well use the result for simple
+ comparisons, but double check it before you adopt it in technical reports or papers.
+1. FLOPs are related to the input shape while parameters are not. The default
+ input shape is (1, 3, 1280, 800).
+2. Some operators are not counted into FLOPs like GN and custom operators. Refer to [`mmcv.cnn.get_model_complexity_info()`](https://github.com/open-mmlab/mmcv/blob/master/mmcv/cnn/utils/flops_counter.py) for details.
+3. The FLOPs of two-stage detectors depend on the number of proposals.
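The first point can be checked by hand for a single convolution layer. The sketch below counts one multiply-accumulate per kernel tap, which is one common convention and an assumption here (it is not taken from `get_flops.py`); both helper functions are hypothetical.

```python
def conv2d_params(c_in, c_out, k, bias=True):
    """Number of learnable parameters of a k x k convolution."""
    return c_out * (c_in * k * k + (1 if bias else 0))


def conv2d_macs(c_in, c_out, k, h_out, w_out):
    """Multiply-accumulate count for one forward pass (assumed convention)."""
    return c_out * h_out * w_out * c_in * k * k


# A 3x3 convolution with 64 input and 64 output channels, stride 1 and
# padding 1, so the output resolution equals the input resolution.
params = conv2d_params(64, 64, 3)
macs_full = conv2d_macs(64, 64, 3, 1280, 800)
macs_half = conv2d_macs(64, 64, 3, 640, 400)

print(f'params: {params}')
print(f'MACs at 1280x800: {macs_full}')
print(f'MACs at 640x400:  {macs_half}')
```

Halving both spatial dimensions divides the compute by four, while the parameter count does not depend on the input shape at all.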
+## Model conversion
+### MMDetection model to ONNX (experimental)
+We provide a script to convert a model to the [ONNX](https://github.com/onnx/onnx) format. We also support comparing the outputs of the PyTorch and ONNX models for verification.
+```shell
+python tools/deployment/pytorch2onnx.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --output_file ${ONNX_FILE} [--shape ${INPUT_SHAPE} --verify]
+```
+**Note**: This tool is still experimental. Some customized operators are not supported for now. For a detailed description of the usage and the list of supported models, please refer to [pytorch2onnx](tutorials/pytorch2onnx.md).
+### MMDetection 1.x model to MMDetection 2.x
+`tools/model_converters/upgrade_model_version.py` upgrades a previous MMDetection checkpoint
+ to the new version. Note that this script is not guaranteed to work as some
+ breaking changes are introduced in the new version. It is recommended to
+ directly use the new checkpoints.
+```shell
+python tools/model_converters/upgrade_model_version.py ${IN_FILE} ${OUT_FILE} [-h] [--num-classes NUM_CLASSES]
+```
+### RegNet model to MMDetection
+`tools/model_converters/regnet2mmdet.py` converts keys in pycls pretrained RegNet models to
+ MMDetection style.
+```shell
+python tools/model_converters/regnet2mmdet.py ${SRC} ${DST} [-h]
+```
+### Detectron ResNet to PyTorch
+`tools/model_converters/detectron2pytorch.py` converts keys in the original Detectron pretrained
+ ResNet models to PyTorch style.
+```shell
+python tools/model_converters/detectron2pytorch.py ${SRC} ${DST} ${DEPTH} [-h]
+```
+### Prepare a model for publishing
+`tools/model_converters/publish_model.py` helps users to prepare their model for publishing.
+Before you upload a model to AWS, you may want to
+1. convert model weights to CPU tensors
+2. delete the optimizer states and
+3. compute the hash of the checkpoint file and append the hash id to the
+ filename.
+```shell
+python tools/model_converters/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
+# Example:
+python tools/model_converters/publish_model.py work_dirs/faster_rcnn/latest.pth faster_rcnn_r50_fpn_1x_20190801.pth
+```
+The final output filename will be `faster_rcnn_r50_fpn_1x_20190801-{hash id}.pth`.
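The third step, appending a hash id to the file name, can be sketched with the standard library alone. The hash algorithm (SHA-256) and the 8-character prefix are assumptions chosen to match the shape of the checkpoint names shown in this document; `hashed_filename` is a hypothetical helper, and the weight conversion and optimizer removal steps are not shown because they require PyTorch.

```python
import hashlib
import os
import tempfile


def hashed_filename(path, digits=8):
    """Return `path` with the first `digits` hex chars of its SHA-256 appended."""
    sha = hashlib.sha256()
    with open(path, 'rb') as f:
        # Read in 1 MiB chunks so large checkpoints do not fill memory.
        for chunk in iter(lambda: f.read(1 << 20), b''):
            sha.update(chunk)
    stem, ext = os.path.splitext(path)
    return f'{stem}-{sha.hexdigest()[:digits]}{ext}'


with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, 'faster_rcnn_r50_fpn_1x_20190801.pth')
    with open(ckpt, 'wb') as f:
        f.write(b'dummy checkpoint bytes')  # stand-in for a real checkpoint
    published = hashed_filename(ckpt)
    os.rename(ckpt, published)
    published_name = os.path.basename(published)
    print(published_name)
```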
+## Dataset Conversion
+`tools/dataset_converters/` contains tools to convert the Cityscapes dataset
+ and Pascal VOC dataset to the COCO format.
+```shell
+python tools/dataset_converters/cityscapes.py ${CITYSCAPES_PATH} [-h] [--img-dir ${IMG_DIR}] [--gt-dir ${GT_DIR}] [-o ${OUT_DIR}] [--nproc ${NPROC}]
+python tools/dataset_converters/pascal_voc.py ${DEVKIT_PATH} [-h] [-o ${OUT_DIR}]
+```
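One detail of such a conversion is the box encoding: Pascal VOC stores corner coordinates `(xmin, ymin, xmax, ymax)`, while COCO stores `[x, y, width, height]`. The sketch below shows only that step with hypothetical helpers. It ignores any pixel-offset convention the real converter may apply, and the annotation dictionary lists only a few commonly used COCO fields.

```python
def voc_box_to_coco(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to COCO's [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def make_annotation(ann_id, image_id, category_id, voc_box):
    """Build a minimal COCO-style annotation dict (illustrative subset)."""
    x, y, w, h = voc_box_to_coco(*voc_box)
    return {
        'id': ann_id,
        'image_id': image_id,
        'category_id': category_id,
        'bbox': [x, y, w, h],
        'area': w * h,
        'iscrowd': 0,
    }


annotation = make_annotation(1, 42, 12, (48, 240, 195, 371))
print(annotation)
```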
+## Robust Detection Benchmark
+`tools/analysis_tools/test_robustness.py` and `tools/analysis_tools/robustness_eval.py` help users evaluate model robustness. The core idea comes from [Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming](https://arxiv.org/abs/1907.07484). For more information on how to evaluate models on corrupted images, and for results on a set of standard models, please refer to [robustness_benchmarking.md](robustness_benchmarking.md).
+## Miscellaneous
+### Evaluating a metric
+`tools/analysis_tools/eval_metric.py` evaluates certain metrics of a pkl result file
+ according to a config file.
+```shell
+python tools/analysis_tools/eval_metric.py ${CONFIG} ${PKL_RESULTS} [-h] [--format-only] [--eval ${EVAL[EVAL ...]}]
+                      [--cfg-options ${CFG_OPTIONS [CFG_OPTIONS ...]}]
+                      [--eval-options ${EVAL_OPTIONS [EVAL_OPTIONS ...]}]
+```
+### Print the entire config
+`tools/misc/print_config.py` prints the whole config verbatim, expanding all its
+ imports.
+```shell
+python tools/misc/print_config.py ${CONFIG} [-h] [--options ${OPTIONS [OPTIONS...]}]
+```
+import mmcv
+from .version import __version__, short_version
+def digit_version(version_str):
+    digit_version = []
+    for x in version_str.split('.'):
+        if x.isdigit():
+            digit_version.append(int(x))
+        elif x.find('rc') != -1:
+            patch_version = x.split('rc')
+            digit_version.append(int(patch_version[0]) - 1)
+            digit_version.append(int(patch_version[1]))
+    return digit_version
+mmcv_minimum_version = '1.3.2'
+mmcv_maximum_version = '1.4.0'
+mmcv_version = digit_version(mmcv.__version__)
+assert (mmcv_version >= digit_version(mmcv_minimum_version)
+        and mmcv_version <= digit_version(mmcv_maximum_version)), \
+    f'MMCV=={mmcv.__version__} is used but incompatible. ' \
+    f'Please install mmcv>={mmcv_minimum_version}, <={mmcv_maximum_version}.'
+__all__ = ['__version__', 'short_version']
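The behaviour of `digit_version` on release candidates is easy to miss, so here is a standalone copy with a few sample inputs. The function body mirrors the one above, with the local variable renamed for readability; the sample version strings are illustrative only.

```python
def digit_version(version_str):
    """Turn a version string into a list of ints that compares correctly."""
    digits = []
    for x in version_str.split('.'):
        if x.isdigit():
            digits.append(int(x))
        elif x.find('rc') != -1:
            # '0rc1' becomes [-1, 1], so a release candidate sorts
            # before the final release it precedes.
            patch, rc = x.split('rc')
            digits.append(int(patch) - 1)
            digits.append(int(rc))
    return digits


minimum = digit_version('1.3.2')
maximum = digit_version('1.4.0')

for candidate in ['1.3.1', '1.3.2', '1.3.5', '1.4.0rc1', '1.4.0', '1.4.1']:
    ok = minimum <= digit_version(candidate) <= maximum
    print(f'{candidate:>9}: {digit_version(candidate)} compatible={ok}')
```

Because Python compares lists element by element, `1.4.0rc1` falls inside the accepted range while `1.4.1` does not.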
+from .inference import (async_inference_detector, inference_detector,
+                        init_detector, show_result_pyplot)
+from .test import multi_gpu_test, single_gpu_test
+from .train import get_root_logger, set_random_seed, train_detector
+__all__ = [
+    'get_root_logger', 'set_random_seed', 'train_detector', 'init_detector',
+    'async_inference_detector', 'inference_detector', 'show_result_pyplot',
+    'multi_gpu_test', 'single_gpu_test'
+]
+import warnings
+import mmcv
+import numpy as np
+import torch
+from mmcv.ops import RoIPool
+from mmcv.parallel import collate, scatter
+from mmcv.runner import load_checkpoint
+from mmdet.core import get_classes
+from mmdet.datasets import replace_ImageToTensor
+from mmdet.datasets.pipelines import Compose
+from mmdet.models import build_detector
+def init_detector(config, checkpoint=None, device='cuda:0', cfg_options=None):
+    """Initialize a detector from config file.
+    Args:
+        config (str or :obj:`mmcv.Config`): Config file path or the config
+            object.
+        checkpoint (str, optional): Checkpoint path. If left as None, the model
+            will not load any weights.
+        cfg_options (dict): Options to override some settings in the used
+            config.
+    Returns:
+        nn.Module: The constructed detector.
+    """
+    if isinstance(config, str):
+        config = mmcv.Config.fromfile(config)
+    elif not isinstance(config, mmcv.Config):
+        raise TypeError('config must be a filename or Config object, '
+                        f'but got {type(config)}')
+    if cfg_options is not None:
+        config.merge_from_dict(cfg_options)
+    config.model.pretrained = None
+    config.model.train_cfg = None
+    model = build_detector(config.model, test_cfg=config.get('test_cfg'))
+    if checkpoint is not None:
+        map_loc = 'cpu' if device == 'cpu' else None
+        checkpoint = load_checkpoint(model, checkpoint, map_location=map_loc)
+        if 'CLASSES' in checkpoint.get('meta', {}):
+            model.CLASSES = checkpoint['meta']['CLASSES']
+        else:
+            warnings.simplefilter('once')
+            warnings.warn('Class names are not saved in the checkpoint\'s '
+                          'meta data, use COCO classes by default.')
+            model.CLASSES = get_classes('coco')
+    model.cfg = config  # save the config in the model for convenience
+    model.to(device)
+    model.eval()
+    return model
+class LoadImage(object):
+    """Deprecated.
+    A simple pipeline to load image.
+    """
+    def __call__(self, results):
+        """Call function to load images into results.
+        Args:
+            results (dict): A result dict contains the file name
+                of the image to be read.
+        Returns:
+            dict: ``results`` will be returned containing loaded image.
+        """
+        warnings.simplefilter('once')
+        warnings.warn('`LoadImage` is deprecated and will be removed in '
+                      'future releases. You may use `LoadImageFromWebcam` '
+                      'from `mmdet.datasets.pipelines.` instead.')
+        if isinstance(results['img'], str):
+            results['filename'] = results['img']
+            results['ori_filename'] = results['img']
+        else:
+            results['filename'] = None
+            results['ori_filename'] = None
+        img = mmcv.imread(results['img'])
+        results['img'] = img
+        results['img_fields'] = ['img']
+        results['img_shape'] = img.shape
+        results['ori_shape'] = img.shape
+        return results
+def inference_detector(model, imgs):
+    """Inference image(s) with the detector.
+    Args:
+        model (nn.Module): The loaded detector.
+        imgs (str/ndarray or list[str/ndarray] or tuple[str/ndarray]):
+            Either image files or loaded images.
+    Returns:
+        If imgs is a list or tuple, the same length list type results
+        will be returned, otherwise return the detection results directly.
+    """
+    if isinstance(imgs, (list, tuple)):
+        is_batch = True
+    else:
+        imgs = [imgs]
+        is_batch = False
+    cfg = model.cfg
+    device = next(model.parameters()).device  # model device
+    if isinstance(imgs[0], np.ndarray):
+        cfg = cfg.copy()
+        # set loading pipeline type
+        cfg.data.test.pipeline[0].type = 'LoadImageFromWebcam'
+    cfg.data.test.pipeline = replace_ImageToTensor(cfg.data.test.pipeline)
+    test_pipeline = Compose(cfg.data.test.pipeline)
+    datas = []
+    for img in imgs:
+        # prepare data
+        if isinstance(img, np.ndarray):
+            # directly add img
+            data = dict(img=img)
+        else:
+            # add information into dict
+            data = dict(img_info=dict(filename=img), img_prefix=None)
+        # build the data pipeline
+        data = test_pipeline(data)
+        datas.append(data)
+    data = collate(datas, samples_per_gpu=len(imgs))
+    # just get the actual data from DataContainer
+    data['img_metas'] = [img_metas.data[0] for img_metas in data['img_metas']]
+    data['img'] = [img.data[0] for img in data['img']]
+    if next(model.parameters()).is_cuda:
+        # scatter to specified GPU
+        data = scatter(data, [device])[0]
+    else:
+        for m in model.modules():
+            assert not isinstance(
+                m, RoIPool
+            ), 'CPU inference with RoIPool is not supported currently.'
+    # forward the model
+    with torch.no_grad():
+        results = model(return_loss=False, rescale=True, **data)
+    if not is_batch:
+        return results[0]
+    else:
+        return results
+async def async_inference_detector(model, imgs):
+    """Async inference image(s) with the detector.
+    Args:
+        model (nn.Module): The loaded detector.
+        imgs (str | ndarray): Either image files or loaded images.
+    Returns:
+        Awaitable detection results.
+    """
+    if not isinstance(imgs, (list, tuple)):
+        imgs = [imgs]
+    cfg = model.cfg
+    device = next(model.parameters()).device  # model device
+    if isinstance(imgs[0], np.ndarray):
+        cfg = cfg.copy()
+        # set loading pipeline type
+        cfg.data.test.pipeline[0].type = 'LoadImageFromWebcam'
+    cfg.data.test.pipeline = replace_ImageToTensor(cfg.data.test.pipeline)
+    test_pipeline = Compose(cfg.data.test.pipeline)
+    datas = []
+    for img in imgs:
+        # prepare data
+        if isinstance(img, np.ndarray):
+            # directly add img
+            data = dict(img=img)
+        else:
+            # add information into dict
+            data = dict(img_info=dict(filename=img), img_prefix=None)
+        # build the data pipeline
+        data = test_pipeline(data)
+        datas.append(data)
+    data = collate(datas, samples_per_gpu=len(imgs))
+    # just get the actual data from DataContainer
+    data['img_metas'] = [img_metas.data[0] for img_metas in data['img_metas']]
+    data['img'] = [img.data[0] for img in data['img']]
+    if next(model.parameters()).is_cuda:
+        # scatter to specified GPU
+        data = scatter(data, [device])[0]
+    else:
+        for m in model.modules():
+            assert not isinstance(
+                m, RoIPool
+            ), 'CPU inference with RoIPool is not supported currently.'
+    # We don't restore `torch.is_grad_enabled()` value during concurrent
+    # inference since execution can overlap
+    torch.set_grad_enabled(False)
+    results = await model.aforward_test(rescale=True, **data)
+    return results
+def show_result_pyplot(model,
+                       img,
+                       result,
+                       score_thr=0.3,
+                       title='result',
+                       wait_time=0):
+    """Visualize the detection results on the image.
+    Args:
+        model (nn.Module): The loaded detector.
+        img (str or np.ndarray): Image filename or loaded image.
+        result (tuple[list] or list): The detection result, can be either
+            (bbox, segm) or just bbox.
+        score_thr (float): The threshold to visualize the bboxes and masks.
+        title (str): Title of the pyplot figure.
+        wait_time (float): Value of waitKey param.
+            Default: 0.
+    """
+    if hasattr(model, 'module'):
+        model = model.module
+    model.show_result(
+        img,
+        result,
+        score_thr=score_thr,
+        show=True,
+        wait_time=wait_time,
+        win_name=title,
+        bbox_color=(72, 101, 241),
+        text_color=(72, 101, 241))
+import os.path as osp
+import pickle
+import shutil
+import tempfile
+import time
+import mmcv
+import torch
+import torch.distributed as dist
+from mmcv.image import tensor2imgs
+from mmcv.runner import get_dist_info
+from mmdet.core import encode_mask_results
+def single_gpu_test(model,
+                    data_loader,
+                    show=False,
+                    out_dir=None,
+                    show_score_thr=0.3):
+    model.eval()
+    results = []
+    dataset = data_loader.dataset
+    prog_bar = mmcv.ProgressBar(len(dataset))
+    for i, data in enumerate(data_loader):
+        with torch.no_grad():
+            result = model(return_loss=False, rescale=True, **data)
+        batch_size = len(result)
+        if show or out_dir:
+            if batch_size == 1 and isinstance(data['img'][0], torch.Tensor):
+                img_tensor = data['img'][0]
+            else:
+                img_tensor = data['img'][0].data[0]
+            img_metas = data['img_metas'][0].data[0]
+            imgs = tensor2imgs(img_tensor, **img_metas[0]['img_norm_cfg'])
+            assert len(imgs) == len(img_metas)
+            for i, (img, img_meta) in enumerate(zip(imgs, img_metas)):
+                h, w, _ = img_meta['img_shape']
+                img_show = img[:h, :w, :]
+                ori_h, ori_w = img_meta['ori_shape'][:-1]
+                img_show = mmcv.imresize(img_show, (ori_w, ori_h))
+                if out_dir:
+                    out_file = osp.join(out_dir, img_meta['ori_filename'])
+                else:
+                    out_file = None
+                model.module.show_result(
+                    img_show,
+                    result[i],
+                    show=show,
+                    out_file=out_file,
+                    score_thr=show_score_thr)
+        # encode mask results
+        if isinstance(result[0], tuple):
+            result = [(bbox_results, encode_mask_results(mask_results))
+                      for bbox_results, mask_results in result]
+        results.extend(result)
+        for _ in range(batch_size):
+            prog_bar.update()
+    return results
+def multi_gpu_test(model, data_loader, tmpdir=None, gpu_collect=False):
+    """Test model with multiple gpus.
+    This method tests model with multiple gpus and collects the results
+    under two different modes: gpu and cpu modes. By setting 'gpu_collect=True'
+    it encodes results to gpu tensors and use gpu communication for results
+    collection. On cpu mode it saves the results on different gpus to 'tmpdir'
+    and collects them by the rank 0 worker.
+    Args:
+        model (nn.Module): Model to be tested.
+        data_loader (nn.Dataloader): Pytorch data loader.
+        tmpdir (str): Path of directory to save the temporary results from
+            different gpus under cpu mode.
+        gpu_collect (bool): Option to use either gpu or cpu to collect results.
+    Returns:
+        list: The prediction results.
+    """
+    model.eval()
+    results = []
+    dataset = data_loader.dataset
+    rank, world_size = get_dist_info()
+    if rank == 0:
+        prog_bar = mmcv.ProgressBar(len(dataset))
+    time.sleep(2)  # This line can prevent deadlock problem in some cases.
+    for i, data in enumerate(data_loader):
+        with torch.no_grad():
+            result = model(return_loss=False, rescale=True, **data)
+            # encode mask results
+            if isinstance(result[0], tuple):
+                result = [(bbox_results, encode_mask_results(mask_results))
+                          for bbox_results, mask_results in result]
+        results.extend(result)
+        if rank == 0:
+            batch_size = len(result)
+            for _ in range(batch_size * world_size):
+                prog_bar.update()
+    # collect results from all ranks
+    if gpu_collect:
+        results = collect_results_gpu(results, len(dataset))
+    else:
+        results = collect_results_cpu(results, len(dataset), tmpdir)
+    return results
+def collect_results_cpu(result_part, size, tmpdir=None):
+    rank, world_size = get_dist_info()
+    # create a tmp dir if it is not specified
+    if tmpdir is None:
+        MAX_LEN = 512
+        # 32 is whitespace
+        dir_tensor = torch.full((MAX_LEN, ),
+                                32,
+                                dtype=torch.uint8,
+                                device='cuda')
+        if rank == 0:
+            mmcv.mkdir_or_exist('.dist_test')
+            tmpdir = tempfile.mkdtemp(dir='.dist_test')
+            tmpdir = torch.tensor(
+                bytearray(tmpdir.encode()), dtype=torch.uint8, device='cuda')
+            dir_tensor[:len(tmpdir)] = tmpdir
+        dist.broadcast(dir_tensor, 0)
+        tmpdir = dir_tensor.cpu().numpy().tobytes().decode().rstrip()
+    else:
+        mmcv.mkdir_or_exist(tmpdir)
+    # dump the part result to the dir
+    mmcv.dump(result_part, osp.join(tmpdir, f'part_{rank}.pkl'))
+    dist.barrier()
+    # collect all parts
+    if rank != 0:
+        return None
+    else:
+        # load results of all parts from tmp dir
+        part_list = []
+        for i in range(world_size):
+            part_file = osp.join(tmpdir, f'part_{i}.pkl')
+            part_list.append(mmcv.load(part_file))
+        # sort the results
+        ordered_results = []
+        for res in zip(*part_list):
+            ordered_results.extend(list(res))
+        # the dataloader may pad some samples
+        ordered_results = ordered_results[:size]
+        # remove tmp dir
+        shutil.rmtree(tmpdir)
+        return ordered_results
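The "sort the results" step relies on how samples are distributed across ranks. The sketch below reproduces just that merge in plain Python. It assumes rank `r` processes samples `r`, `r + world_size`, and so on, and that the sampler pads the last round by wrapping around to the start of the dataset; `merge_parts` is a hypothetical stand-in for the loop above.

```python
def merge_parts(part_list, size):
    """Interleave per-rank result lists and drop padded samples."""
    ordered = []
    for res in zip(*part_list):
        # One element from every rank, in rank order.
        ordered.extend(list(res))
    # The dataloader may pad some samples, so cut back to the dataset size.
    return ordered[:size]


# Two ranks, five samples. Rank 1 received a padded duplicate of sample 0.
part_list = [
    ['s0', 's2', 's4'],  # rank 0
    ['s1', 's3', 's0'],  # rank 1 (last entry is padding)
]
merged = merge_parts(part_list, size=5)
print(merged)
```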
+def collect_results_gpu(result_part, size):
+    rank, world_size = get_dist_info()
+    # dump result part to tensor with pickle
+    part_tensor = torch.tensor(
+        bytearray(pickle.dumps(result_part)), dtype=torch.uint8, device='cuda')
+    # gather all result part tensor shape
+    shape_tensor = torch.tensor(part_tensor.shape, device='cuda')
+    shape_list = [shape_tensor.clone() for _ in range(world_size)]
+    dist.all_gather(shape_list, shape_tensor)
+    # padding result part tensor to max length
+    shape_max = torch.tensor(shape_list).max()
+    part_send = torch.zeros(shape_max, dtype=torch.uint8, device='cuda')
+    part_send[:shape_tensor[0]] = part_tensor
+    part_recv_list = [
+        part_tensor.new_zeros(shape_max) for _ in range(world_size)
+    ]
+    # gather all result part
+    dist.all_gather(part_recv_list, part_send)
+    if rank == 0:
+        part_list = []
+        for recv, shape in zip(part_recv_list, shape_list):
+            part_list.append(
+                pickle.loads(recv[:shape[0]].cpu().numpy().tobytes()))
+        # sort the results
+        ordered_results = []
+        for res in zip(*part_list):
+            ordered_results.extend(list(res))
+        # the dataloader may pad some samples
+        ordered_results = ordered_results[:size]
+        return ordered_results
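The GPU path pads every pickled payload to a common length because a collective gather needs equally sized buffers. The following sketch shows the same pack, pad and unpack idea with byte arrays only, with no tensors or distributed communication involved; `pack` and `pad` are hypothetical helpers.

```python
import pickle


def pack(result_part):
    """Serialize one rank's results into a mutable byte buffer."""
    return bytearray(pickle.dumps(result_part))


def pad(buf, length):
    """Zero-pad `buf` on the right to exactly `length` bytes."""
    out = bytearray(length)
    out[:len(buf)] = buf
    return out


# Results of two simulated ranks with different payload sizes.
parts = [
    [{'bbox': [1, 2, 3, 4]}],
    [{'bbox': [5, 6, 7, 8]}, {'bbox': [9, 10, 11, 12]}],
]
payloads = [pack(p) for p in parts]
lengths = [len(p) for p in payloads]  # plays the role of the gathered shapes
max_len = max(lengths)
padded = [pad(p, max_len) for p in payloads]

# The receiver slices each buffer back to its true length before unpickling.
recovered = [pickle.loads(bytes(buf[:n])) for buf, n in zip(padded, lengths)]
print(recovered == parts)
```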
+import random
+import warnings
+import numpy as np
+import torch
+from mmcv.parallel import MMDataParallel, MMDistributedDataParallel
+from mmcv.runner import (HOOKS, DistSamplerSeedHook, EpochBasedRunner,
+                         Fp16OptimizerHook, OptimizerHook, build_optimizer,
+                         build_runner)
+from mmcv.utils import build_from_cfg
+from mmdet.core import DistEvalHook, EvalHook
+from mmdet.datasets import (build_dataloader, build_dataset,
+                            replace_ImageToTensor)
+from mmdet.utils import get_root_logger
+def set_random_seed(seed, deterministic=False):
+    """Set random seed.
+    Args:
+        seed (int): Seed to be used.
+        deterministic (bool): Whether to set the deterministic option for
+            CUDNN backend, i.e., set `torch.backends.cudnn.deterministic`
+            to True and `torch.backends.cudnn.benchmark` to False.
+            Default: False.
+    """
+    random.seed(seed)
+    np.random.seed(seed)
+    torch.manual_seed(seed)
+    torch.cuda.manual_seed_all(seed)
+    if deterministic:
+        torch.backends.cudnn.deterministic = True
+        torch.backends.cudnn.benchmark = False
+def train_detector(model,
+ dataset,
+ cfg,
+ distributed=False,
+ validate=False,
+ timestamp=None,
+ meta=None):
+ logger = get_root_logger(cfg.log_level)
+ # prepare data loaders
+ dataset = dataset if isinstance(dataset, (list, tuple)) else [dataset]
+ if 'imgs_per_gpu' in cfg.data:
+ logger.warning('"imgs_per_gpu" is deprecated in MMDet V2.0. '
+ 'Please use "samples_per_gpu" instead')
+ if 'samples_per_gpu' in cfg.data:
+ logger.warning(
+ f'Got "imgs_per_gpu"={cfg.data.imgs_per_gpu} and '
+ f'"samples_per_gpu"={cfg.data.samples_per_gpu}, "imgs_per_gpu"'
+ f'={cfg.data.imgs_per_gpu} is used in this experiments')
+ else:
+ logger.warning(
+ 'Automatically set "samples_per_gpu"="imgs_per_gpu"='
+ f'{cfg.data.imgs_per_gpu} in this experiments')
+ cfg.data.samples_per_gpu = cfg.data.imgs_per_gpu
+ data_loaders = [
+ build_dataloader(
+ ds,
+ cfg.data.samples_per_gpu,
+ cfg.data.workers_per_gpu,
+ # cfg.gpus will be ignored if distributed
+ len(cfg.gpu_ids),
+ dist=distributed,
+ seed=cfg.seed) for ds in dataset
+ ]
+ # put model on gpus
+ if distributed:
+ find_unused_parameters = cfg.get('find_unused_parameters', False)
+ # Sets the `find_unused_parameters` parameter in
+ # torch.nn.parallel.DistributedDataParallel
+ model = MMDistributedDataParallel(
+ model.cuda(),
+ device_ids=[torch.cuda.current_device()],
+ broadcast_buffers=False,
+ find_unused_parameters=find_unused_parameters)
+ else:
+ model = MMDataParallel(
+ model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)
+ # build runner
+ optimizer = build_optimizer(model, cfg.optimizer)
+ if 'runner' not in cfg:
+ cfg.runner = {
+ 'type': 'EpochBasedRunner',
+ 'max_epochs': cfg.total_epochs
+ }
+ warnings.warn(
+ 'config is now expected to have a `runner` section, '
+ 'please set `runner` in your config.', UserWarning)
+ else:
+ if 'total_epochs' in cfg:
+ assert cfg.total_epochs == cfg.runner.max_epochs
+ runner = build_runner(
+ cfg.runner,
+ default_args=dict(
+ model=model,
+ optimizer=optimizer,
+ work_dir=cfg.work_dir,
+ logger=logger,
+ meta=meta))
+ # an ugly workaround to make .log and .log.json filenames the same
+ runner.timestamp = timestamp
+ # fp16 setting
+ fp16_cfg = cfg.get('fp16', None)
+ if fp16_cfg is not None:
+ optimizer_config = Fp16OptimizerHook(
+ **cfg.optimizer_config, **fp16_cfg, distributed=distributed)
+ elif distributed and 'type' not in cfg.optimizer_config:
+ optimizer_config = OptimizerHook(**cfg.optimizer_config)
+ else:
+ optimizer_config = cfg.optimizer_config
+ # register hooks
+ runner.register_training_hooks(cfg.lr_config, optimizer_config,
+ cfg.checkpoint_config, cfg.log_config,
+ cfg.get('momentum_config', None))
+ if distributed:
+ if isinstance(runner, EpochBasedRunner):
+ runner.register_hook(DistSamplerSeedHook())
+ # register eval hooks
+ if validate:
+ # Support batch_size > 1 in validation
+ val_samples_per_gpu = cfg.data.val.pop('samples_per_gpu', 1)
+ if val_samples_per_gpu > 1:
+ # Replace 'ImageToTensor' with 'DefaultFormatBundle'
+ cfg.data.val.pipeline = replace_ImageToTensor(
+ cfg.data.val.pipeline)
+ val_dataset = build_dataset(cfg.data.val, dict(test_mode=True))
+ val_dataloader = build_dataloader(
+ val_dataset,
+ samples_per_gpu=val_samples_per_gpu,
+ workers_per_gpu=cfg.data.workers_per_gpu,
+ dist=distributed,
+ shuffle=False)
+ eval_cfg = cfg.get('evaluation', {})
+ eval_cfg['by_epoch'] = cfg.runner['type'] != 'IterBasedRunner'
+ eval_hook = DistEvalHook if distributed else EvalHook
+ runner.register_hook(eval_hook(val_dataloader, **eval_cfg))
+ # user-defined hooks
+ if cfg.get('custom_hooks', None):
+ custom_hooks = cfg.custom_hooks
+ assert isinstance(custom_hooks, list), \
+ f'custom_hooks expect list type, but got {type(custom_hooks)}'
+ for hook_cfg in cfg.custom_hooks:
+ assert isinstance(hook_cfg, dict), \
+ 'Each item in custom_hooks expects dict type, but got ' \
+ f'{type(hook_cfg)}'
+ hook_cfg = hook_cfg.copy()
+ priority = hook_cfg.pop('priority', 'NORMAL')
+ hook = build_from_cfg(hook_cfg, HOOKS)
+ runner.register_hook(hook, priority=priority)
+ if cfg.resume_from:
+ runner.resume(cfg.resume_from)
+ elif cfg.load_from:
+ runner.load_checkpoint(cfg.load_from)
+ runner.run(data_loaders, cfg.workflow)
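The deprecated-key handling at the top of `train_detector` can be looked at in isolation. The following is a minimal, dependency-free sketch that uses a plain dict in place of the mmcv config object; the helper name `resolve_samples_per_gpu` is hypothetical and not part of MMDetection.

```python
def resolve_samples_per_gpu(data_cfg):
    """Return (updated copy of data_cfg, list of warning messages).

    Hypothetical helper: mirrors the branch structure in train_detector,
    where the deprecated "imgs_per_gpu" value overrides "samples_per_gpu".
    """
    data_cfg = dict(data_cfg)  # work on a copy, leave the caller's dict alone
    messages = []
    if 'imgs_per_gpu' in data_cfg:
        messages.append('"imgs_per_gpu" is deprecated, use "samples_per_gpu"')
        if 'samples_per_gpu' in data_cfg:
            messages.append('both keys given, "imgs_per_gpu" takes precedence')
        else:
            messages.append('"samples_per_gpu" set from "imgs_per_gpu"')
        # in both branches the deprecated value wins
        data_cfg['samples_per_gpu'] = data_cfg['imgs_per_gpu']
    return data_cfg, messages


cfg, msgs = resolve_samples_per_gpu({'imgs_per_gpu': 2, 'samples_per_gpu': 4})
print(cfg['samples_per_gpu'], len(msgs))
```

Under these assumptions, a config that still carries `imgs_per_gpu` ends up with that value in `samples_per_gpu`, while a config with only `samples_per_gpu` passes through unchanged and produces no warnings.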
+from .anchor import * # noqa: F401, F403
+from .bbox import * # noqa: F401, F403
+from .evaluation import * # noqa: F401, F403
+from .mask import * # noqa: F401, F403
+from .post_processing import * # noqa: F401, F403
+from .utils import * # noqa: F401, F403
+from .anchor_generator import (AnchorGenerator, LegacyAnchorGenerator,
+ YOLOAnchorGenerator)
+from .builder import ANCHOR_GENERATORS, build_anchor_generator
+from .point_generator import PointGenerator
+from .utils import anchor_inside_flags, calc_region, images_to_levels
+__all__ = [
+ 'AnchorGenerator', 'LegacyAnchorGenerator', 'anchor_inside_flags',
+ 'PointGenerator', 'images_to_levels', 'calc_region',
+ 'build_anchor_generator', 'ANCHOR_GENERATORS', 'YOLOAnchorGenerator'
+]
+import mmcv
+import numpy as np
+import torch
+from torch.nn.modules.utils import _pair
+from .builder import ANCHOR_GENERATORS
+@ANCHOR_GENERATORS.register_module()
+class AnchorGenerator(object):
+ """Standard anchor generator for 2D anchor-based detectors.
+ Args:
+ strides (list[int] | list[tuple[int, int]]): Strides of anchors
+ in multiple feature levels in order (w, h).
+ ratios (list[float]): The list of ratios between the height and width
+ of anchors in a single level.
+ scales (list[int] | None): Anchor scales for anchors in a single level.
+ It cannot be set at the same time if `octave_base_scale` and
+ `scales_per_octave` are set.
+ base_sizes (list[int] | None): The basic sizes
+ of anchors in multiple levels.
+ If None is given, strides will be used as base_sizes.
+ (If strides are non-square, the shortest stride is taken.)
+ scale_major (bool): Whether to multiply scales first when generating
+ base anchors. If true, the anchors in the same row will have the
+ same scales. By default it is True in V2.0
+ octave_base_scale (int): The base scale of octave.
+ scales_per_octave (int): Number of scales for each octave.
+ `octave_base_scale` and `scales_per_octave` are usually used in
+ retinanet and the `scales` should be None when they are set.
+ centers (list[tuple[float, float]] | None): The centers of the anchor
+ relative to the feature grid center in multiple feature levels.
+ By default it is set to be None and not used. If a list of tuple of
+ float is given, they will be used to shift the centers of anchors.
+ center_offset (float): The offset of center in proportion to anchors'
+ width and height. By default it is 0 in V2.0.
+ Examples:
+ >>> from mmdet.core import AnchorGenerator
+ >>> self = AnchorGenerator([16], [1.], [1.], [9])
+ >>> all_anchors = self.grid_anchors([(2, 2)], device='cpu')
+ >>> print(all_anchors)
+ [tensor([[-4.5000, -4.5000, 4.5000, 4.5000],
+ [11.5000, -4.5000, 20.5000, 4.5000],
+ [-4.5000, 11.5000, 4.5000, 20.5000],
+ [11.5000, 11.5000, 20.5000, 20.5000]])]
+ >>> self = AnchorGenerator([16, 32], [1.], [1.], [9, 18])
+ >>> all_anchors = self.grid_anchors([(2, 2), (1, 1)], device='cpu')
+ >>> print(all_anchors)
+ [tensor([[-4.5000, -4.5000, 4.5000, 4.5000],
+ [11.5000, -4.5000, 20.5000, 4.5000],
+ [-4.5000, 11.5000, 4.5000, 20.5000],
+ [11.5000, 11.5000, 20.5000, 20.5000]]), \
+ tensor([[-9., -9., 9., 9.]])]
+ """
+ def __init__(self,
+ strides,
+ ratios,
+ scales=None,
+ base_sizes=None,
+ scale_major=True,
+ octave_base_scale=None,
+ scales_per_octave=None,
+ centers=None,
+ center_offset=0.):
+ # check center and center_offset
+ if center_offset != 0:
+ assert centers is None, 'center cannot be set when center_offset' \
+ f'!=0, {centers} is given.'
+ if not (0 <= center_offset <= 1):
+ raise ValueError('center_offset should be in range [0, 1], '
+ f'{center_offset} is given.')
+ if centers is not None:
+ assert len(centers) == len(strides), \
+ 'The number of strides should be the same as centers, got ' \
+ f'{strides} and {centers}'
+ # calculate base sizes of anchors
+ self.strides = [_pair(stride) for stride in strides]
+ self.base_sizes = [min(stride) for stride in self.strides
+ ] if base_sizes is None else base_sizes
+ assert len(self.base_sizes) == len(self.strides), \
+ 'The number of strides should be the same as base sizes, got ' \
+ f'{self.strides} and {self.base_sizes}'
+ # calculate scales of anchors
+ assert ((octave_base_scale is not None
+ and scales_per_octave is not None) ^ (scales is not None)), \
+ 'scales and octave_base_scale with scales_per_octave cannot' \
+ ' be set at the same time'
+ if scales is not None:
+ self.scales = torch.Tensor(scales)
+ elif octave_base_scale is not None and scales_per_octave is not None:
+ octave_scales = np.array(
+ [2**(i / scales_per_octave) for i in range(scales_per_octave)])
+ scales = octave_scales * octave_base_scale
+ self.scales = torch.Tensor(scales)
+ else:
+ raise ValueError('Either scales or octave_base_scale with '
+ 'scales_per_octave should be set')
+ self.octave_base_scale = octave_base_scale
+ self.scales_per_octave = scales_per_octave
+ self.ratios = torch.Tensor(ratios)
+ self.scale_major = scale_major
+ self.centers = centers
+ self.center_offset = center_offset
+ self.base_anchors = self.gen_base_anchors()
+ @property
+ def num_base_anchors(self):
+ """list[int]: total number of base anchors in a feature grid"""
+ return [base_anchors.size(0) for base_anchors in self.base_anchors]
+ @property
+ def num_levels(self):
+ """int: number of feature levels that the generator will be applied"""
+ return len(self.strides)
+ def gen_base_anchors(self):
+ """Generate base anchors.
+ Returns:
+ list(torch.Tensor): Base anchors of a feature grid in multiple \
+ feature levels.
+ """
+ multi_level_base_anchors = []
+ for i, base_size in enumerate(self.base_sizes):
+ center = None
+ if self.centers is not None:
+ center = self.centers[i]
+ multi_level_base_anchors.append(
+ self.gen_single_level_base_anchors(
+ base_size,
+ scales=self.scales,
+ ratios=self.ratios,
+ center=center))
+ return multi_level_base_anchors
+ def gen_single_level_base_anchors(self,
+ base_size,
+ scales,
+ ratios,
+ center=None):
+ """Generate base anchors of a single level.
+ Args:
+ base_size (int | float): Basic size of an anchor.
+ scales (torch.Tensor): Scales of the anchor.
+ ratios (torch.Tensor): The ratio between the height
+ and width of anchors in a single level.
+ center (tuple[float], optional): The center of the base anchor
+ related to a single feature grid. Defaults to None.
+ Returns:
+ torch.Tensor: Anchors in a single-level feature map.
+ """
+ w = base_size
+ h = base_size
+ if center is None:
+ x_center = self.center_offset * w
+ y_center = self.center_offset * h
+ else:
+ x_center, y_center = center
+ h_ratios = torch.sqrt(ratios)
+ w_ratios = 1 / h_ratios
+ if self.scale_major:
+ ws = (w * w_ratios[:, None] * scales[None, :]).view(-1)
+ hs = (h * h_ratios[:, None] * scales[None, :]).view(-1)
+ else:
+ ws = (w * scales[:, None] * w_ratios[None, :]).view(-1)
+ hs = (h * scales[:, None] * h_ratios[None, :]).view(-1)
+ # use float anchor and the anchor's center is aligned with the
+ # pixel center
+ base_anchors = [
+ x_center - 0.5 * ws, y_center - 0.5 * hs, x_center + 0.5 * ws,
+ y_center + 0.5 * hs
+ ]
+ base_anchors = torch.stack(base_anchors, dim=-1)
+ return base_anchors
+ def _meshgrid(self, x, y, row_major=True):
+ """Generate mesh grid of x and y.
+ Args:
+ x (torch.Tensor): Grids of x dimension.
+ y (torch.Tensor): Grids of y dimension.
+ row_major (bool, optional): Whether to return x grids first.
+ Defaults to True.
+ Returns:
+ tuple[torch.Tensor]: The mesh grids of x and y.
+ """
+ # use shape instead of len to keep tracing while exporting to onnx
+ xx = x.repeat(y.shape[0])
+ yy = y.view(-1, 1).repeat(1, x.shape[0]).view(-1)
+ if row_major:
+ return xx, yy
+ else:
+ return yy, xx
+ def grid_anchors(self, featmap_sizes, device='cuda'):
+ """Generate grid anchors in multiple feature levels.
+ Args:
+ featmap_sizes (list[tuple]): List of feature map sizes in
+ multiple feature levels.
+ device (str): Device where the anchors will be put on.
+ Return:
+ list[torch.Tensor]: Anchors in multiple feature levels. \
+ The sizes of each tensor should be [N, 4], where \
+ N = width * height * num_base_anchors, width and height \
+ are the sizes of the corresponding feature level, \
+ num_base_anchors is the number of anchors for that level.
+ """
+ assert self.num_levels == len(featmap_sizes)
+ multi_level_anchors = []
+ for i in range(self.num_levels):
+ anchors = self.single_level_grid_anchors(
+ self.base_anchors[i].to(device),
+ featmap_sizes[i],
+ self.strides[i],
+ device=device)
+ multi_level_anchors.append(anchors)
+ return multi_level_anchors
+ def single_level_grid_anchors(self,
+ base_anchors,
+ featmap_size,
+ stride=(16, 16),
+ device='cuda'):
+ """Generate grid anchors of a single level.
+ Note:
+ This function is usually called by method ``self.grid_anchors``.
+ Args:
+ base_anchors (torch.Tensor): The base anchors of a feature grid.
+ featmap_size (tuple[int]): Size of the feature maps.
+ stride (tuple[int], optional): Stride of the feature map in order
+ (w, h). Defaults to (16, 16).
+ device (str, optional): Device the tensor will be put on.
+ Defaults to 'cuda'.
+ Returns:
+ torch.Tensor: Anchors in the overall feature maps.
+ """
+ # keep as Tensor, so that we can convert to ONNX correctly
+ feat_h, feat_w = featmap_size
+ shift_x = torch.arange(0, feat_w, device=device) * stride[0]
+ shift_y = torch.arange(0, feat_h, device=device) * stride[1]
+ shift_xx, shift_yy = self._meshgrid(shift_x, shift_y)
+ shifts = torch.stack([shift_xx, shift_yy, shift_xx, shift_yy], dim=-1)
+ shifts = shifts.type_as(base_anchors)
+ # first feat_w elements correspond to the first row of shifts
+ # add A anchors (1, A, 4) to K shifts (K, 1, 4) to get
+ # shifted anchors (K, A, 4), reshape to (K*A, 4)
+ all_anchors = base_anchors[None, :, :] + shifts[:, None, :]
+ all_anchors = all_anchors.view(-1, 4)
+ # first A rows correspond to A anchors of (0, 0) in feature map,
+ # then (0, 1), (0, 2), ...
+ return all_anchors
+ def valid_flags(self, featmap_sizes, pad_shape, device='cuda'):
+ """Generate valid flags of anchors in multiple feature levels.
+ Args:
+ featmap_sizes (list(tuple)): List of feature map sizes in
+ multiple feature levels.
+ pad_shape (tuple): The padded shape of the image.
+ device (str): Device where the anchors will be put on.
+ Return:
+ list(torch.Tensor): Valid flags of anchors in multiple levels.
+ """
+ assert self.num_levels == len(featmap_sizes)
+ multi_level_flags = []
+ for i in range(self.num_levels):
+ anchor_stride = self.strides[i]
+ feat_h, feat_w = featmap_sizes[i]
+ h, w = pad_shape[:2]
+ valid_feat_h = min(int(np.ceil(h / anchor_stride[1])), feat_h)
+ valid_feat_w = min(int(np.ceil(w / anchor_stride[0])), feat_w)
+ flags = self.single_level_valid_flags((feat_h, feat_w),
+ (valid_feat_h, valid_feat_w),
+ self.num_base_anchors[i],
+ device=device)
+ multi_level_flags.append(flags)
+ return multi_level_flags
+ def single_level_valid_flags(self,
+ featmap_size,
+ valid_size,
+ num_base_anchors,
+ device='cuda'):
+ """Generate the valid flags of anchor in a single feature map.
+ Args:
+ featmap_size (tuple[int]): The size of feature maps.
+ valid_size (tuple[int]): The valid size of the feature maps.
+ num_base_anchors (int): The number of base anchors.
+ device (str, optional): Device where the flags will be put on.
+ Defaults to 'cuda'.
+ Returns:
+ torch.Tensor: The valid flags of each anchor in a single level \
+ feature map.
+ """
+ feat_h, feat_w = featmap_size
+ valid_h, valid_w = valid_size
+ assert valid_h <= feat_h and valid_w <= feat_w
+ valid_x = torch.zeros(feat_w, dtype=torch.bool, device=device)
+ valid_y = torch.zeros(feat_h, dtype=torch.bool, device=device)
+ valid_x[:valid_w] = 1
+ valid_y[:valid_h] = 1
+ valid_xx, valid_yy = self._meshgrid(valid_x, valid_y)
+ valid = valid_xx & valid_yy
+ valid = valid[:, None].expand(valid.size(0),
+ num_base_anchors).contiguous().view(-1)
+ return valid
+ def __repr__(self):
+ """str: a string that describes the module"""
+ indent_str = ' '
+ repr_str = self.__class__.__name__ + '(\n'
+ repr_str += f'{indent_str}strides={self.strides},\n'
+ repr_str += f'{indent_str}ratios={self.ratios},\n'
+ repr_str += f'{indent_str}scales={self.scales},\n'
+ repr_str += f'{indent_str}base_sizes={self.base_sizes},\n'
+ repr_str += f'{indent_str}scale_major={self.scale_major},\n'
+ repr_str += f'{indent_str}octave_base_scale='
+ repr_str += f'{self.octave_base_scale},\n'
+ repr_str += f'{indent_str}scales_per_octave='
+ repr_str += f'{self.scales_per_octave},\n'
+ repr_str += f'{indent_str}num_levels={self.num_levels},\n'
+ repr_str += f'{indent_str}centers={self.centers},\n'
+ repr_str += f'{indent_str}center_offset={self.center_offset})'
+ return repr_str
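The two-step construction in `AnchorGenerator` (base anchors per level, then shifting them over the feature grid) can be traced without tensors. The sketch below is an illustrative re-implementation with plain Python lists, not the library code; the function names `base_anchors` and `grid_anchors_single_level` are chosen here for illustration only.

```python
import math


def base_anchors(base_size, scales, ratios, center_offset=0.0,
                 scale_major=True):
    """Base anchors (x1, y1, x2, y2) for one level, as plain tuples."""
    x_center = center_offset * base_size
    y_center = center_offset * base_size
    if scale_major:
        # ratio index varies slowest, scale index fastest
        combos = [(r, s) for r in ratios for s in scales]
    else:
        # scale index varies slowest, ratio index fastest
        combos = [(r, s) for s in scales for r in ratios]
    anchors = []
    for ratio, scale in combos:
        h_ratio = math.sqrt(ratio)
        w_ratio = 1.0 / h_ratio
        w = base_size * w_ratio * scale
        h = base_size * h_ratio * scale
        anchors.append((x_center - 0.5 * w, y_center - 0.5 * h,
                        x_center + 0.5 * w, y_center + 0.5 * h))
    return anchors


def grid_anchors_single_level(base, featmap_size, stride):
    """Shift every base anchor to every grid cell (row-major, x fastest)."""
    feat_h, feat_w = featmap_size
    shifted = []
    for row in range(feat_h):
        for col in range(feat_w):
            sx, sy = col * stride[0], row * stride[1]
            for x1, y1, x2, y2 in base:
                shifted.append((x1 + sx, y1 + sy, x2 + sx, y2 + sy))
    return shifted


anchors = grid_anchors_single_level(
    base_anchors(9, scales=[1.0], ratios=[1.0]), (2, 2), (16, 16))
for a in anchors:
    print(a)
```

With `base_size=9`, a single scale and ratio of 1, a 2x2 feature map and stride 16, this reproduces the four anchors listed in the first docstring example of `AnchorGenerator`.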
+@ANCHOR_GENERATORS.register_module()
+class SSDAnchorGenerator(AnchorGenerator):
+ """Anchor generator for SSD.
+ Args:
+ strides (list[int] | list[tuple[int, int]]): Strides of anchors
+ in multiple feature levels.
+ ratios (list[float]): The list of ratios between the height and width
+ of anchors in a single level.
+ basesize_ratio_range (tuple(float)): Ratio range of anchors.
+ input_size (int): Size of the input image, 300 for SSD300,
+ 512 for SSD512.
+ scale_major (bool): Whether to multiply scales first when generating
+ base anchors. If true, the anchors in the same row will have the
+ same scales. It is always set to be False in SSD.
+ """
+ def __init__(self,
+ strides,
+ ratios,
+ basesize_ratio_range,
+ input_size=300,
+ scale_major=True):
+ assert len(strides) == len(ratios)
+ assert mmcv.is_tuple_of(basesize_ratio_range, float)
+ self.strides = [_pair(stride) for stride in strides]
+ self.input_size = input_size
+ self.centers = [(stride[0] / 2., stride[1] / 2.)
+ for stride in self.strides]
+ self.basesize_ratio_range = basesize_ratio_range
+ # calculate anchor ratios and sizes
+ min_ratio, max_ratio = basesize_ratio_range
+ min_ratio = int(min_ratio * 100)
+ max_ratio = int(max_ratio * 100)
+ step = int(np.floor(max_ratio - min_ratio) / (self.num_levels - 2))
+ min_sizes = []
+ max_sizes = []
+ for ratio in range(int(min_ratio), int(max_ratio) + 1, step):
+ min_sizes.append(int(self.input_size * ratio / 100))
+ max_sizes.append(int(self.input_size * (ratio + step) / 100))
+ if self.input_size == 300:
+ if basesize_ratio_range[0] == 0.15: # SSD300 COCO
+ min_sizes.insert(0, int(self.input_size * 7 / 100))
+ max_sizes.insert(0, int(self.input_size * 15 / 100))
+ elif basesize_ratio_range[0] == 0.2: # SSD300 VOC
+ min_sizes.insert(0, int(self.input_size * 10 / 100))
+ max_sizes.insert(0, int(self.input_size * 20 / 100))
+ else:
+ raise ValueError(
+ 'basesize_ratio_range[0] should be either 0.15 '
+ 'or 0.2 when input_size is 300, got '
+ f'{basesize_ratio_range[0]}.')
+ elif self.input_size == 512:
+ if basesize_ratio_range[0] == 0.1: # SSD512 COCO
+ min_sizes.insert(0, int(self.input_size * 4 / 100))
+ max_sizes.insert(0, int(self.input_size * 10 / 100))
+ elif basesize_ratio_range[0] == 0.15: # SSD512 VOC
+ min_sizes.insert(0, int(self.input_size * 7 / 100))
+ max_sizes.insert(0, int(self.input_size * 15 / 100))
+ else:
+ raise ValueError('basesize_ratio_range[0] should be either 0.1 '
+ 'or 0.15 when input_size is 512, got'
+ f' {basesize_ratio_range[0]}.')
+ else:
+ raise ValueError('Only support 300 or 512 in SSDAnchorGenerator'
+ f', got {self.input_size}.')
+ anchor_ratios = []
+ anchor_scales = []
+ for k in range(len(self.strides)):
+ scales = [1., np.sqrt(max_sizes[k] / min_sizes[k])]
+ anchor_ratio = [1.]
+ for r in ratios[k]:
+ anchor_ratio += [1 / r, r] # 4 or 6 ratio
+ anchor_ratios.append(torch.Tensor(anchor_ratio))
+ anchor_scales.append(torch.Tensor(scales))
+ self.base_sizes = min_sizes
+ self.scales = anchor_scales
+ self.ratios = anchor_ratios
+ self.scale_major = scale_major
+ self.center_offset = 0
+ self.base_anchors = self.gen_base_anchors()
+ def gen_base_anchors(self):
+ """Generate base anchors.
+ Returns:
+ list(torch.Tensor): Base anchors of a feature grid in multiple \
+ feature levels.
+ """
+ multi_level_base_anchors = []
+ for i, base_size in enumerate(self.base_sizes):
+ base_anchors = self.gen_single_level_base_anchors(
+ base_size,
+ scales=self.scales[i],
+ ratios=self.ratios[i],
+ center=self.centers[i])
+ indices = list(range(len(self.ratios[i])))
+ indices.insert(1, len(indices))
+ base_anchors = torch.index_select(base_anchors, 0,
+ torch.LongTensor(indices))
+ multi_level_base_anchors.append(base_anchors)
+ return multi_level_base_anchors
+ def __repr__(self):
+ """str: a string that describes the module"""
+ indent_str = ' '
+ repr_str = self.__class__.__name__ + '(\n'
+ repr_str += f'{indent_str}strides={self.strides},\n'
+ repr_str += f'{indent_str}scales={self.scales},\n'
+ repr_str += f'{indent_str}scale_major={self.scale_major},\n'
+ repr_str += f'{indent_str}input_size={self.input_size},\n'
+ repr_str += f'{indent_str}ratios={self.ratios},\n'
+ repr_str += f'{indent_str}num_levels={self.num_levels},\n'
+ repr_str += f'{indent_str}base_sizes={self.base_sizes},\n'
+ repr_str += f'{indent_str}basesize_ratio_range='
+ repr_str += f'{self.basesize_ratio_range})'
+ return repr_str
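The size schedule computed in `SSDAnchorGenerator.__init__` is easier to check with concrete numbers. The sketch below re-derives `min_sizes` and `max_sizes` with plain Python arithmetic; the function name `ssd_min_max_sizes` and the lookup-table layout are assumptions made for this illustration, while the percentages are taken from the branches above.

```python
def ssd_min_max_sizes(input_size, basesize_ratio_range, num_levels):
    """Return (min_sizes, max_sizes) following the SSD schedule above."""
    # (input_size, min ratio in percent) -> first-level (min %, max %)
    first_level_pct = {
        (300, 15): (7, 15),   # SSD300 COCO
        (300, 20): (10, 20),  # SSD300 VOC
        (512, 10): (4, 10),   # SSD512 COCO
        (512, 15): (7, 15),   # SSD512 VOC
    }
    min_pct = int(basesize_ratio_range[0] * 100)
    max_pct = int(basesize_ratio_range[1] * 100)
    key = (input_size, min_pct)
    if key not in first_level_pct:
        raise ValueError(f'unsupported combination {key}')
    step = int((max_pct - min_pct) / (num_levels - 2))
    min_sizes, max_sizes = [], []
    for pct in range(min_pct, max_pct + 1, step):
        min_sizes.append(int(input_size * pct / 100))
        max_sizes.append(int(input_size * (pct + step) / 100))
    # the first level is prepended separately
    first_min, first_max = first_level_pct[key]
    min_sizes.insert(0, int(input_size * first_min / 100))
    max_sizes.insert(0, int(input_size * first_max / 100))
    return min_sizes, max_sizes


print(ssd_min_max_sizes(300, (0.15, 0.9), 6))
```

For an input size of 300, a ratio range of `(0.15, 0.9)` and six levels, the step is 18 percentage points and the call returns `([21, 45, 99, 153, 207, 261], [45, 99, 153, 207, 261, 315])`.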
+@ANCHOR_GENERATORS.register_module()
+class LegacyAnchorGenerator(AnchorGenerator):
+ """Legacy anchor generator used in MMDetection V1.x.
+ Note:
+ Difference to the V2.0 anchor generator:
+ 1. The center offset of V1.x anchors is set to 0.5 rather than 0.
+ 2. The width/height are reduced by 1 when calculating the anchors' \
+ centers and corners to meet the V1.x coordinate system.
+ 3. The anchors' corners are quantized.
+ Args:
+ strides (list[int] | list[tuple[int]]): Strides of anchors
+ in multiple feature levels.
+ ratios (list[float]): The list of ratios between the height and width
+ of anchors in a single level.
+ scales (list[int] | None): Anchor scales for anchors in a single level.
+ It cannot be set at the same time if `octave_base_scale` and
+ `scales_per_octave` are set.
+ base_sizes (list[int]): The basic sizes of anchors in multiple levels.
+ If None is given, strides will be used to generate base_sizes.
+ scale_major (bool): Whether to multiply scales first when generating
+ base anchors. If true, the anchors in the same row will have the
+ same scales. By default it is True in V2.0
+ octave_base_scale (int): The base scale of octave.
+ scales_per_octave (int): Number of scales for each octave.
+ `octave_base_scale` and `scales_per_octave` are usually used in
+ retinanet and the `scales` should be None when they are set.
+ centers (list[tuple[float, float]] | None): The centers of the anchor
+ relative to the feature grid center in multiple feature levels.
+ By default it is set to be None and not used. If a list of float
+ is given, this list will be used to shift the centers of anchors.
+ center_offset (float): The offset of center in proportion to anchors'
+ width and height. By default it is 0 in V2.0 but it should be 0.5
+ in v1.x models.
+ Examples:
+ >>> from mmdet.core import LegacyAnchorGenerator
+ >>> self = LegacyAnchorGenerator(
+ ... [16], [1.], [1.], [9], center_offset=0.5)
+ >>> all_anchors = self.grid_anchors(((2, 2),), device='cpu')
+ >>> print(all_anchors)
+ [tensor([[ 0., 0., 8., 8.],
+ [16., 0., 24., 8.],
+ [ 0., 16., 8., 24.],
+ [16., 16., 24., 24.]])]
+ """
+ def gen_single_level_base_anchors(self,
+ base_size,
+ scales,
+ ratios,
+ center=None):
+ """Generate base anchors of a single level.
+ Note:
+ The width/height of anchors are reduced by 1 when calculating \
+ the centers and corners to meet the V1.x coordinate system.
+ Args:
+ base_size (int | float): Basic size of an anchor.
+ scales (torch.Tensor): Scales of the anchor.
+ ratios (torch.Tensor): The ratio between the height
+ and width of anchors in a single level.
+ center (tuple[float], optional): The center of the base anchor
+ related to a single feature grid. Defaults to None.
+ Returns:
+ torch.Tensor: Anchors in a single-level feature map.
+ """
+ w = base_size
+ h = base_size
+ if center is None:
+ x_center = self.center_offset * (w - 1)
+ y_center = self.center_offset * (h - 1)
+ else:
+ x_center, y_center = center
+ h_ratios = torch.sqrt(ratios)
+ w_ratios = 1 / h_ratios
+ if self.scale_major:
+ ws = (w * w_ratios[:, None] * scales[None, :]).view(-1)
+ hs = (h * h_ratios[:, None] * scales[None, :]).view(-1)
+ else:
+ ws = (w * scales[:, None] * w_ratios[None, :]).view(-1)
+ hs = (h * scales[:, None] * h_ratios[None, :]).view(-1)
+ # use float anchor and the anchor's center is aligned with the
+ # pixel center
+ base_anchors = [
+ x_center - 0.5 * (ws - 1), y_center - 0.5 * (hs - 1),
+ x_center + 0.5 * (ws - 1), y_center + 0.5 * (hs - 1)
+ ]
+ base_anchors = torch.stack(base_anchors, dim=-1).round()
+ return base_anchors
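The difference between the V2.0 and V1.x conventions is easiest to see on a single base anchor. The following sketch computes one anchor both ways with scalar arithmetic; `v2_base_anchor` and `legacy_base_anchor` are illustrative names, and Python's built-in `round` stands in for the tensor `.round()` call.

```python
import math


def v2_base_anchor(base_size, scale=1.0, ratio=1.0, center_offset=0.0):
    """One base anchor in the V2.0 convention (float corners)."""
    h_ratio = math.sqrt(ratio)
    w_ratio = 1.0 / h_ratio
    w = base_size * w_ratio * scale
    h = base_size * h_ratio * scale
    xc = center_offset * base_size
    yc = center_offset * base_size
    return (xc - 0.5 * w, yc - 0.5 * h, xc + 0.5 * w, yc + 0.5 * h)


def legacy_base_anchor(base_size, scale=1.0, ratio=1.0, center_offset=0.5):
    """One base anchor in the V1.x convention (sizes minus 1, rounded)."""
    h_ratio = math.sqrt(ratio)
    w_ratio = 1.0 / h_ratio
    w = base_size * w_ratio * scale
    h = base_size * h_ratio * scale
    xc = center_offset * (base_size - 1)
    yc = center_offset * (base_size - 1)
    corners = (xc - 0.5 * (w - 1), yc - 0.5 * (h - 1),
               xc + 0.5 * (w - 1), yc + 0.5 * (h - 1))
    return tuple(float(round(c)) for c in corners)


print(v2_base_anchor(9))      # (-4.5, -4.5, 4.5, 4.5)
print(legacy_base_anchor(9))  # (0.0, 0.0, 8.0, 8.0)
```

Both results agree with the first rows of the docstring examples of `AnchorGenerator` and `LegacyAnchorGenerator` respectively.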
+@ANCHOR_GENERATORS.register_module()
+class LegacySSDAnchorGenerator(SSDAnchorGenerator, LegacyAnchorGenerator):
+ """Legacy anchor generator used in MMDetection V1.x.
+ The difference between `LegacySSDAnchorGenerator` and `SSDAnchorGenerator`
+ can be found in `LegacyAnchorGenerator`.
+ """
+ def __init__(self,
+ strides,
+ ratios,
+ basesize_ratio_range,
+ input_size=300,
+ scale_major=True):
+ super(LegacySSDAnchorGenerator,
+ self).__init__(strides, ratios, basesize_ratio_range, input_size,
+ scale_major)
+ self.centers = [((stride - 1) / 2., (stride - 1) / 2.)
+ for stride in strides]
+ self.base_anchors = self.gen_base_anchors()
+@ANCHOR_GENERATORS.register_module()
+class YOLOAnchorGenerator(AnchorGenerator):
+ """Anchor generator for YOLO.
+ Args:
+ strides (list[int] | list[tuple[int, int]]): Strides of anchors
+ in multiple feature levels.
+ base_sizes (list[list[tuple[int, int]]]): The basic sizes
+ of anchors in multiple levels.
+ """
+ def __init__(self, strides, base_sizes):
+ self.strides = [_pair(stride) for stride in strides]
+ self.centers = [(stride[0] / 2., stride[1] / 2.)
+ for stride in self.strides]
+ self.base_sizes = []
+ num_anchor_per_level = len(base_sizes[0])
+ for base_sizes_per_level in base_sizes:
+ assert num_anchor_per_level == len(base_sizes_per_level)
+ self.base_sizes.append(
+ [_pair(base_size) for base_size in base_sizes_per_level])
+ self.base_anchors = self.gen_base_anchors()
+ @property
+ def num_levels(self):
+ """int: number of feature levels that the generator will be applied"""
+ return len(self.base_sizes)
+ def gen_base_anchors(self):
+ """Generate base anchors.
+ Returns:
+ list(torch.Tensor): Base anchors of a feature grid in multiple \
+ feature levels.
+ """
+ multi_level_base_anchors = []
+ for i, base_sizes_per_level in enumerate(self.base_sizes):
+ center = None
+ if self.centers is not None:
+ center = self.centers[i]
+ multi_level_base_anchors.append(
+ self.gen_single_level_base_anchors(base_sizes_per_level,
+ center))
+ return multi_level_base_anchors
+ def gen_single_level_base_anchors(self, base_sizes_per_level, center=None):
+ """Generate base anchors of a single level.
+ Args:
+ base_sizes_per_level (list[tuple[int, int]]): Basic sizes of
+ anchors.
+ center (tuple[float], optional): The center of the base anchor
+ related to a single feature grid. Defaults to None.
+ Returns:
+ torch.Tensor: Anchors in a single-level feature map.
+ """
+ x_center, y_center = center
+ base_anchors = []
+ for base_size in base_sizes_per_level:
+ w, h = base_size
+ # use float anchor and the anchor's center is aligned with the
+ # pixel center
+ base_anchor = torch.Tensor([
+ x_center - 0.5 * w, y_center - 0.5 * h, x_center + 0.5 * w,
+ y_center + 0.5 * h
+ ])
+ base_anchors.append(base_anchor)
+ base_anchors = torch.stack(base_anchors, dim=0)
+ return base_anchors
+ def responsible_flags(self, featmap_sizes, gt_bboxes, device='cuda'):
+ """Generate responsible anchor flags of grid cells in multiple scales.
+ Args:
+ featmap_sizes (list(tuple)): List of feature map sizes in multiple
+ feature levels.
+ gt_bboxes (Tensor): Ground truth boxes, shape (n, 4).
+ device (str): Device where the anchors will be put on.
+ Return:
+ list(torch.Tensor): responsible flags of anchors in multiple levels
+ """
+ assert self.num_levels == len(featmap_sizes)
+ multi_level_responsible_flags = []
+ for i in range(self.num_levels):
+ anchor_stride = self.strides[i]
+ flags = self.single_level_responsible_flags(
+ featmap_sizes[i],
+ gt_bboxes,
+ anchor_stride,
+ self.num_base_anchors[i],
+ device=device)
+ multi_level_responsible_flags.append(flags)
+ return multi_level_responsible_flags
+ def single_level_responsible_flags(self,
+ featmap_size,
+ gt_bboxes,
+ stride,
+ num_base_anchors,
+ device='cuda'):
+ """Generate the responsible flags of anchor in a single feature map.
+ Args:
+ featmap_size (tuple[int]): The size of feature maps.
+ gt_bboxes (Tensor): Ground truth boxes, shape (n, 4).
+ stride (tuple(int)): stride of current level
+ num_base_anchors (int): The number of base anchors.
+ device (str, optional): Device where the flags will be put on.
+ Defaults to 'cuda'.
+ Returns:
+ torch.Tensor: The responsible flags of each anchor in a single level \
+ feature map.
+ """
+ feat_h, feat_w = featmap_size
+ gt_bboxes_cx = ((gt_bboxes[:, 0] + gt_bboxes[:, 2]) * 0.5).to(device)
+ gt_bboxes_cy = ((gt_bboxes[:, 1] + gt_bboxes[:, 3]) * 0.5).to(device)
+ gt_bboxes_grid_x = torch.floor(gt_bboxes_cx / stride[0]).long()
+ gt_bboxes_grid_y = torch.floor(gt_bboxes_cy / stride[1]).long()
+ # row major indexing
+ gt_bboxes_grid_idx = gt_bboxes_grid_y * feat_w + gt_bboxes_grid_x
+ responsible_grid = torch.zeros(
+ feat_h * feat_w, dtype=torch.uint8, device=device)
+ responsible_grid[gt_bboxes_grid_idx] = 1
+ responsible_grid = responsible_grid[:, None].expand(
+ responsible_grid.size(0), num_base_anchors).contiguous().view(-1)
+ return responsible_grid
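The responsible-cell logic of `single_level_responsible_flags` can be followed with plain lists: each ground-truth box marks the grid cell that contains its center, and that mark is repeated once per base anchor. The sketch below is a tensor-free illustration under that reading; the function name is chosen here and is not part of the library.

```python
import math


def responsible_flags_single_level(featmap_size, gt_bboxes, stride,
                                   num_base_anchors):
    """Flag every anchor whose grid cell contains a ground-truth center."""
    feat_h, feat_w = featmap_size
    grid = [0] * (feat_h * feat_w)
    for x1, y1, x2, y2 in gt_bboxes:
        cx = (x1 + x2) * 0.5
        cy = (y1 + y2) * 0.5
        grid_x = math.floor(cx / stride[0])
        grid_y = math.floor(cy / stride[1])
        # row-major index of the responsible cell
        grid[grid_y * feat_w + grid_x] = 1
    # repeat each cell flag once per base anchor
    return [flag for flag in grid for _ in range(num_base_anchors)]


flags = responsible_flags_single_level(
    (2, 2), [(0, 0, 10, 10), (16, 16, 30, 30)], (16, 16), 3)
print(flags)  # [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1]
```

Here the first box has its center in cell (0, 0) and the second in cell (1, 1), so the first and last groups of three anchors are flagged.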
+from mmcv.utils import Registry, build_from_cfg
+ANCHOR_GENERATORS = Registry('Anchor generator')
+def build_anchor_generator(cfg, default_args=None):
+ return build_from_cfg(cfg, ANCHOR_GENERATORS, default_args)
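`build_anchor_generator` relies on the registry pattern: classes register themselves under their name, and a config dict with a `type` key selects which one to instantiate. The sketch below is a deliberately simplified stand-in for that pattern, not mmcv's implementation; `SimpleRegistry`, `simple_build_from_cfg`, and `ToyGenerator` are hypothetical names used only for this illustration.

```python
class SimpleRegistry:
    """Toy registry mapping class names to classes (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def get(self, key):
        return self._modules[key]


def simple_build_from_cfg(cfg, registry, default_args=None):
    """Instantiate the class named by cfg['type'] with the remaining keys."""
    args = dict(cfg)  # copy so the caller's config is not modified
    obj_type = args.pop('type')
    for key, value in (default_args or {}).items():
        args.setdefault(key, value)  # explicit config values win
    return registry.get(obj_type)(**args)


TOY_GENERATORS = SimpleRegistry('toy anchor generator')


@TOY_GENERATORS.register_module()
class ToyGenerator:
    def __init__(self, strides, ratios=(1.0,)):
        self.strides = strides
        self.ratios = ratios


gen = simple_build_from_cfg(
    dict(type='ToyGenerator', strides=[8, 16]), TOY_GENERATORS,
    default_args=dict(ratios=(0.5, 1.0)))
print(type(gen).__name__, gen.strides, gen.ratios)
```

The design keeps configs declarative: swapping one generator for another only requires changing the `type` string, provided the target class has been registered.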
+import torch
+from .builder import ANCHOR_GENERATORS
+@ANCHOR_GENERATORS.register_module()
+class PointGenerator(object):
+ def _meshgrid(self, x, y, row_major=True):
+ xx = x.repeat(len(y))
+ yy = y.view(-1, 1).repeat(1, len(x)).view(-1)
+ if row_major:
+ return xx, yy
+ else:
+ return yy, xx
+ def grid_points(self, featmap_size, stride=16, device='cuda'):
+ feat_h, feat_w = featmap_size
+ shift_x = torch.arange(0., feat_w, device=device) * stride
+ shift_y = torch.arange(0., feat_h, device=device) * stride
+ shift_xx, shift_yy = self._meshgrid(shift_x, shift_y)
+ stride = shift_x.new_full((shift_xx.shape[0], ), stride)
+ shifts = torch.stack([shift_xx, shift_yy, stride], dim=-1)
+ all_points = shifts.to(device)
+ return all_points
+ def valid_flags(self, featmap_size, valid_size, device='cuda'):
+ feat_h, feat_w = featmap_size
+ valid_h, valid_w = valid_size
+ assert valid_h <= feat_h and valid_w <= feat_w
+ valid_x = torch.zeros(feat_w, dtype=torch.bool, device=device)
+ valid_y = torch.zeros(feat_h, dtype=torch.bool, device=device)
+ valid_x[:valid_w] = 1
+ valid_y[:valid_h] = 1
+ valid_xx, valid_yy = self._meshgrid(valid_x, valid_y)
+ valid = valid_xx & valid_yy
+ return valid
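`PointGenerator.grid_points` produces one `(x, y, stride)` triple per feature-map cell, ordered row by row. A tensor-free sketch of the same layout, with the function name `grid_points_list` chosen here for illustration:

```python
def grid_points_list(featmap_size, stride=16):
    """Return (x, y, stride) for every cell, x varying fastest."""
    feat_h, feat_w = featmap_size
    points = []
    for row in range(feat_h):      # y varies slowest
        for col in range(feat_w):  # x varies fastest
            points.append((float(col * stride), float(row * stride),
                           float(stride)))
    return points


for p in grid_points_list((2, 2), stride=16):
    print(p)
```

For a 2x2 feature map with stride 16 this yields the points (0, 0), (16, 0), (0, 16) and (16, 16), each paired with the stride value 16.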
+import torch
+def images_to_levels(target, num_levels):
+ """Convert targets by image to targets by feature level.
+ [target_img0, target_img1] -> [target_level0, target_level1, ...]
+ """
+ target = torch.stack(target, 0)
+ level_targets = []
+ start = 0
+ for n in num_levels:
+ end = start + n
+ # level_targets.append(target[:, start:end].squeeze(0))
+ level_targets.append(target[:, start:end])
+ start = end
+ return level_targets
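The regrouping done by `images_to_levels` can be shown with nested lists instead of a stacked tensor. In this sketch, `target` holds one flat list per image and `num_levels` holds the number of anchors on each level, as in the function above; the name `images_to_levels_lists` is hypothetical.

```python
def images_to_levels_lists(target, num_levels):
    """Regroup per-image flat targets into per-level chunks."""
    level_targets = []
    start = 0
    for n in num_levels:
        end = start + n
        # one slice per image, kept together for this level
        level_targets.append([per_image[start:end] for per_image in target])
        start = end
    return level_targets


per_image = [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14]]
print(images_to_levels_lists(per_image, [3, 2]))
# [[[0, 1, 2], [10, 11, 12]], [[3, 4], [13, 14]]]
```

Each entry of the result corresponds to one feature level and contains that level's slice for every image, which matches the `target[:, start:end]` indexing in the tensor version.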
+def anchor_inside_flags(flat_anchors,
+ valid_flags,
+ img_shape,
+ allowed_border=0):
+ """Check whether the anchors are inside the border.
+ Args:
+ flat_anchors (torch.Tensor): Flatten anchors, shape (n, 4).
+ valid_flags (torch.Tensor): An existing valid flags of anchors.
+ img_shape (tuple(int)): Shape of current image.
+ allowed_border (int, optional): The border to allow the valid anchor.
+ Defaults to 0.
+ Returns:
+ torch.Tensor: Flags indicating whether the anchors are inside a \
+ valid range.
+ """
+ img_h, img_w = img_shape[:2]
+ if allowed_border >= 0:
+ inside_flags = valid_flags & \
+ (flat_anchors[:, 0] >= -allowed_border) & \
+ (flat_anchors[:, 1] >= -allowed_border) & \
+ (flat_anchors[:, 2] < img_w + allowed_border) & \
+ (flat_anchors[:, 3] < img_h + allowed_border)
+ else:
+ inside_flags = valid_flags
+ return inside_flags
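The border test in `anchor_inside_flags` combines the existing valid flags with four coordinate comparisons, and a negative `allowed_border` disables the check. A scalar sketch of the same rule, using tuples and Python booleans rather than tensors (the name `inside_flags_list` is an assumption for this example):

```python
def inside_flags_list(flat_anchors, valid_flags, img_shape, allowed_border=0):
    """Return one bool per anchor: valid and within the allowed border."""
    img_h, img_w = img_shape[:2]
    if allowed_border < 0:
        # negative border: only the precomputed valid flags apply
        return list(valid_flags)
    flags = []
    for (x1, y1, x2, y2), valid in zip(flat_anchors, valid_flags):
        inside = (x1 >= -allowed_border and y1 >= -allowed_border
                  and x2 < img_w + allowed_border
                  and y2 < img_h + allowed_border)
        flags.append(bool(valid and inside))
    return flags


anchors = [(0, 0, 10, 10), (-4, 0, 10, 10), (0, 0, 100, 10)]
valid = [True, True, True]
print(inside_flags_list(anchors, valid, (50, 100)))     # [True, False, False]
print(inside_flags_list(anchors, valid, (50, 100), 4))  # [True, True, True]
```

Note that the upper bounds are strict: an anchor whose right edge equals the image width is rejected unless `allowed_border` is positive.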
+def calc_region(bbox, ratio, featmap_size=None):
+ """Calculate a proportional bbox region.
+ The bbox center is fixed; before rounding and clipping, the new w' and h'
+ are w * (1 - 2 * ratio) and h * (1 - 2 * ratio).
+ Args:
+ bbox (Tensor): Bboxes to calculate regions, shape (n, 4).
+ ratio (float): Ratio of the output region.
+ featmap_size (tuple): Feature map size used for clipping the boundary.
+ Returns:
+ tuple: x1, y1, x2, y2
+ """
+ x1 = torch.round((1 - ratio) * bbox[0] + ratio * bbox[2]).long()
+ y1 = torch.round((1 - ratio) * bbox[1] + ratio * bbox[3]).long()
+ x2 = torch.round(ratio * bbox[0] + (1 - ratio) * bbox[2]).long()
+ y2 = torch.round(ratio * bbox[1] + (1 - ratio) * bbox[3]).long()
+ if featmap_size is not None:
+ x1 = x1.clamp(min=0, max=featmap_size[1])
+ y1 = y1.clamp(min=0, max=featmap_size[0])
+ x2 = x2.clamp(min=0, max=featmap_size[1])
+ y2 = y2.clamp(min=0, max=featmap_size[0])
+ return (x1, y1, x2, y2)
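A worked example helps check the arithmetic in `calc_region`. The sketch below repeats the same interpolation on plain Python numbers; `calc_region_scalar` is a hypothetical name, and it assumes `featmap_size` is given as `(height, width)`, which matches how the indices are used in the clamping above.

```python
def calc_region_scalar(bbox, ratio, featmap_size=None):
    """Scalar analogue of calc_region (hypothetical helper)."""
    x1, y1, x2, y2 = bbox
    # Each side moves toward the opposite side by ratio * extent.
    nx1 = round((1 - ratio) * x1 + ratio * x2)
    ny1 = round((1 - ratio) * y1 + ratio * y2)
    nx2 = round(ratio * x1 + (1 - ratio) * x2)
    ny2 = round(ratio * y1 + (1 - ratio) * y2)
    if featmap_size is not None:
        feat_h, feat_w = featmap_size
        nx1 = min(max(nx1, 0), feat_w)
        ny1 = min(max(ny1, 0), feat_h)
        nx2 = min(max(nx2, 0), feat_w)
        ny2 = min(max(ny2, 0), feat_h)
    return nx1, ny1, nx2, ny2


region = calc_region_scalar((10, 20, 50, 60), ratio=0.25)
clipped = calc_region_scalar((10, 20, 50, 60), ratio=0.25, featmap_size=(35, 35))
print(region)   # (20, 30, 40, 50)
print(clipped)  # (20, 30, 35, 35)
```

With `ratio=0.25` the 40 x 40 box shrinks to 20 x 20 around the same center, that is, to `(1 - 2 * ratio)` of its original size.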
+from .assigners import (AssignResult, BaseAssigner, CenterRegionAssigner,
+ MaxIoUAssigner, RegionAssigner)
+from .builder import build_assigner, build_bbox_coder, build_sampler
+from .coder import (BaseBBoxCoder, DeltaXYWHBBoxCoder, PseudoBBoxCoder,
+ TBLRBBoxCoder)
+from .iou_calculators import BboxOverlaps2D, bbox_overlaps
+from .samplers import (BaseSampler, CombinedSampler,
+ InstanceBalancedPosSampler, IoUBalancedNegSampler,
+ OHEMSampler, PseudoSampler, RandomSampler,
+ SamplingResult, ScoreHLRSampler)
+from .transforms import (bbox2distance, bbox2result, bbox2roi,
+ bbox_cxcywh_to_xyxy, bbox_flip, bbox_mapping,
+ bbox_mapping_back, bbox_rescale, bbox_xyxy_to_cxcywh,
+ distance2bbox, roi2bbox)
+__all__ = [
+ 'bbox_overlaps', 'BboxOverlaps2D', 'BaseAssigner', 'MaxIoUAssigner',
+ 'AssignResult', 'BaseSampler', 'PseudoSampler', 'RandomSampler',
+ 'InstanceBalancedPosSampler', 'IoUBalancedNegSampler', 'CombinedSampler',
+ 'OHEMSampler', 'SamplingResult', 'ScoreHLRSampler', 'build_assigner',
+ 'build_sampler', 'bbox_flip', 'bbox_mapping', 'bbox_mapping_back',
+ 'bbox2roi', 'roi2bbox', 'bbox2result', 'distance2bbox', 'bbox2distance',
+ 'build_bbox_coder', 'BaseBBoxCoder', 'PseudoBBoxCoder',
+ 'DeltaXYWHBBoxCoder', 'TBLRBBoxCoder', 'CenterRegionAssigner',
+ 'bbox_rescale', 'bbox_cxcywh_to_xyxy', 'bbox_xyxy_to_cxcywh',
+ 'RegionAssigner'
+ ]
+from .approx_max_iou_assigner import ApproxMaxIoUAssigner
+from .assign_result import AssignResult
+from .atss_assigner import ATSSAssigner
+from .base_assigner import BaseAssigner
+from .center_region_assigner import CenterRegionAssigner
+from .grid_assigner import GridAssigner
+from .hungarian_assigner import HungarianAssigner
+from .max_iou_assigner import MaxIoUAssigner
+from .point_assigner import PointAssigner
+from .region_assigner import RegionAssigner
+from .uniform_assigner import UniformAssigner
+__all__ = [
+ 'BaseAssigner', 'MaxIoUAssigner', 'ApproxMaxIoUAssigner', 'AssignResult',
+ 'PointAssigner', 'ATSSAssigner', 'CenterRegionAssigner', 'GridAssigner',
+ 'HungarianAssigner', 'RegionAssigner', 'UniformAssigner'
+ ]
+import torch
+from ..builder import BBOX_ASSIGNERS
+from ..iou_calculators import build_iou_calculator
+from .max_iou_assigner import MaxIoUAssigner
+class ApproxMaxIoUAssigner(MaxIoUAssigner):
+ """Assign a corresponding gt bbox or background to each bbox.
+ Each proposal will be assigned an integer indicating the ground truth
+ index. (non-negative integer: index (0-based) of the assigned gt,
+ -1: background)
+ - -1: negative sample, no assigned gt
+ - non-negative integer: positive sample, index (0-based) of assigned gt
+ Args:
+ pos_iou_thr (float): IoU threshold for positive bboxes.
+ neg_iou_thr (float or tuple): IoU threshold for negative bboxes.
+ min_pos_iou (float): Minimum iou for a bbox to be considered as a
+ positive bbox. Positive samples can have smaller IoU than
+ pos_iou_thr due to the 4th step (assign max IoU sample to each gt).
+ gt_max_assign_all (bool): Whether to assign all bboxes with the same
+ highest overlap with some gt to that gt.
+ ignore_iof_thr (float): IoF threshold for ignoring bboxes (if
+ `gt_bboxes_ignore` is specified). Negative values mean not
+ ignoring any bboxes.
+ ignore_wrt_candidates (bool): Whether to compute the iof between
+ `bboxes` and `gt_bboxes_ignore`, or the contrary.
+ match_low_quality (bool): Whether to allow low-quality matches. This is
+ usually allowed for RPN and single stage detectors, but not allowed
+ in the second stage.
+ gpu_assign_thr (int): Upper bound on the number of gts for GPU
+ assignment. When the number of gts exceeds this threshold, the
+ assignment runs on CPU. Non-positive values disable CPU assignment.
+ iou_calculator (dict): Config of the IoU calculator. Defaults to
+ dict(type='BboxOverlaps2D').
+ """
+ def __init__(self,
+ pos_iou_thr,
+ neg_iou_thr,
+ min_pos_iou=.0,
+ gt_max_assign_all=True,
+ ignore_iof_thr=-1,
+ ignore_wrt_candidates=True,
+ match_low_quality=True,
+ gpu_assign_thr=-1,
+ iou_calculator=dict(type='BboxOverlaps2D')):
+ self.pos_iou_thr = pos_iou_thr
+ self.neg_iou_thr = neg_iou_thr
+ self.min_pos_iou = min_pos_iou
+ self.gt_max_assign_all = gt_max_assign_all
+ self.ignore_iof_thr = ignore_iof_thr
+ self.ignore_wrt_candidates = ignore_wrt_candidates
+ self.gpu_assign_thr = gpu_assign_thr
+ self.match_low_quality = match_low_quality
+ self.iou_calculator = build_iou_calculator(iou_calculator)
+ def assign(self,
+ approxs,
+ squares,
+ approxs_per_octave,
+ gt_bboxes,
+ gt_bboxes_ignore=None,
+ gt_labels=None):
+ """Assign gt to approxs.
+ This method assigns a gt bbox to each group of approxs (bboxes).
+ Each group of approxs is represented by a base approx (bbox) and
+ will be assigned -1 or a non-negative number.
+ background_label (-1) means negative sample;
+ a non-negative number is the index (0-based) of the assigned gt.
+ The IoU of a group with a gt is the max IoU over the approxs in
+ that group. The assignment is done in the following steps; the
+ order matters.
+ 1. assign every bbox to background_label (-1)
+ 2. assign proposals whose iou with all gts < neg_iou_thr to background
+ 3. for each bbox, if the iou with its nearest gt >= pos_iou_thr,
+ assign it to that gt
+ 4. for each gt bbox, assign its nearest proposals (may be more than
+ one) to itself
+ Args:
+ approxs (Tensor): Bounding boxes to be assigned,
+ shape(approxs_per_octave*n, 4).
+ squares (Tensor): Base Bounding boxes to be assigned,
+ shape(n, 4).
+ approxs_per_octave (int): number of approxs per octave
+ gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
+ gt_bboxes_ignore (Tensor, optional): Ground truth bboxes that are
+ labelled as `ignored`, e.g., crowd boxes in COCO.
+ gt_labels (Tensor, optional): Label of gt_bboxes, shape (k, ).
+ Returns:
+ :obj:`AssignResult`: The assign result.
+ """
+ num_squares = squares.size(0)
+ num_gts = gt_bboxes.size(0)
+ if num_squares == 0 or num_gts == 0:
+ # No predictions and/or truth, return empty assignment
+ overlaps = approxs.new(num_gts, num_squares)
+ assign_result = self.assign_wrt_overlaps(overlaps, gt_labels)
+ return assign_result
+ # re-organize anchors by approxs_per_octave x num_squares
+ approxs = torch.transpose(
+ approxs.view(num_squares, approxs_per_octave, 4), 0,
+ 1).contiguous().view(-1, 4)
+ assign_on_cpu = (self.gpu_assign_thr > 0) and (
+ num_gts > self.gpu_assign_thr)
+ # compute overlap and assign gt on CPU when number of GT is large
+ if assign_on_cpu:
+ device = approxs.device
+ approxs = approxs.cpu()
+ gt_bboxes = gt_bboxes.cpu()
+ if gt_bboxes_ignore is not None:
+ gt_bboxes_ignore = gt_bboxes_ignore.cpu()
+ if gt_labels is not None:
+ gt_labels = gt_labels.cpu()
+ all_overlaps = self.iou_calculator(approxs, gt_bboxes)
+ overlaps, _ = all_overlaps.view(approxs_per_octave, num_squares,
+ num_gts).max(dim=0)
+ overlaps = torch.transpose(overlaps, 0, 1)
+ if (self.ignore_iof_thr > 0 and gt_bboxes_ignore is not None
+ and gt_bboxes_ignore.numel() > 0 and squares.numel() > 0):
+ if self.ignore_wrt_candidates:
+ ignore_overlaps = self.iou_calculator(
+ squares, gt_bboxes_ignore, mode='iof')
+ ignore_max_overlaps, _ = ignore_overlaps.max(dim=1)
+ else:
+ ignore_overlaps = self.iou_calculator(
+ gt_bboxes_ignore, squares, mode='iof')
+ ignore_max_overlaps, _ = ignore_overlaps.max(dim=0)
+ overlaps[:, ignore_max_overlaps > self.ignore_iof_thr] = -1
+ assign_result = self.assign_wrt_overlaps(overlaps, gt_labels)
+ if assign_on_cpu:
+ assign_result.gt_inds = assign_result.gt_inds.to(device)
+ assign_result.max_overlaps = assign_result.max_overlaps.to(device)
+ if assign_result.labels is not None:
+ assign_result.labels = assign_result.labels.to(device)
+ return assign_result
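The step that distinguishes `ApproxMaxIoUAssigner.assign` from its parent is the reduction of per-approx IoUs to one IoU per square. The sketch below shows that reduction on plain Python lists. It is an illustration only: `iou` and `group_max_overlaps` are hypothetical helpers, the IoU is a basic area-based version that may differ in detail from `BboxOverlaps2D`, and the input is assumed to be laid out square by square, as the `view(num_squares, approxs_per_octave, 4)` call above implies.

```python
def iou(box_a, box_b):
    """Plain IoU of two (x1, y1, x2, y2) boxes (hypothetical helper)."""
    inter_w = max(min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]), 0.0)
    inter_h = max(min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]), 0.0)
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_max_overlaps(approxs, approxs_per_octave, gt_bboxes):
    """Return overlaps[g][s]: max IoU of gt g over the approxs of square s."""
    num_squares = len(approxs) // approxs_per_octave
    overlaps = []
    for gt in gt_bboxes:
        row = []
        for s in range(num_squares):
            # Approxs of square s are stored contiguously.
            group = approxs[s * approxs_per_octave:(s + 1) * approxs_per_octave]
            row.append(max(iou(a, gt) for a in group))
        overlaps.append(row)
    return overlaps


approxs = [
    (0, 0, 10, 10), (0, 0, 5, 10),       # approxs of square 0
    (20, 20, 30, 30), (5, 0, 15, 10),    # approxs of square 1
]
gts = [(0, 0, 10, 10)]
overlaps = group_max_overlaps(approxs, approxs_per_octave=2, gt_bboxes=gts)
```

Here `overlaps` has shape `(num_gts, num_squares)`, the same orientation that the method hands to `assign_wrt_overlaps`; square 0 scores 1.0 through its first approx and square 1 scores 1/3 through its second.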
+import torch
+from mmdet.utils import util_mixins
+class AssignResult(util_mixins.NiceRepr):
+ """Stores assignments between predicted and truth boxes.
+ Attributes:
+ num_gts (int): the number of truth boxes considered when computing this
+ assignment
+ gt_inds (LongTensor): for each predicted box indicates the 1-based
+ index of the assigned truth box. 0 means unassigned and -1 means
+ ignore.
+ max_overlaps (FloatTensor): the iou between the predicted box and its
+ assigned truth box.
+ labels (None | LongTensor): If specified, for each predicted box
+ indicates the category label of the assigned truth box.
+ Example:
+ >>> # An assign result between 4 predicted boxes and 9 true boxes
+ >>> # where only two boxes were assigned.
+ >>> num_gts = 9
+ >>> max_overlaps = torch.FloatTensor([0, .5, .9, 0])
+ >>> gt_inds = torch.LongTensor([-1, 1, 2, 0])
+ >>> labels = torch.LongTensor([0, 3, 4, 0])
+ >>> self = AssignResult(num_gts, gt_inds, max_overlaps, labels)
+ >>> print(str(self)) # xdoctest: +IGNORE_WANT