# Feature Selective Anchor-Free Module for Single-Shot Object Detection
FSAF is an anchor-free method published at CVPR 2019 (https://arxiv.org/pdf/1903.00621.pdf). In practice it is equivalent to an anchor-based method with a single anchor at each feature-map position in each FPN level, and that is how it is implemented here. Only the anchor-free branch is released, because of its better compatibility with the current framework and lower computational cost.
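To make the single-anchor interpretation concrete, below is a minimal sketch of how such a head could be declared in an mmdetection-style config. The names (`FSAF`, `FSAFHead`) and the exact keys are assumptions for illustration and may differ across framework versions; the released config files are authoritative.

```python
# Illustrative sketch only: a single square anchor per feature-map position
# in every FPN level, standing in for the anchor-free grid point.
# 'FSAF' / 'FSAFHead' and the exact keys are assumptions, not the released config.
model = dict(
    type='FSAF',
    bbox_head=dict(
        type='FSAFHead',
        num_classes=80,
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            octave_base_scale=1,   # one scale ...
            scales_per_octave=1,   # ... per octave -> a single anchor size
            ratios=[1.0],          # square anchors only
            strides=[8, 16, 32, 64, 128])))  # one stride per FPN level
```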
In the original paper, feature-map locations falling between the central 0.2 and 0.5 regions of a ground-truth box are tagged as ignored. However, it is empirically found that a hard threshold (0.2-0.2) gives a further gain in performance (see the table below).
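The 0.2/0.5 numbers refer to centrally scaled versions of the ground-truth box: locations inside the 0.2-scaled box are positives, and (in the paper's setting) locations between the 0.2- and 0.5-scaled boxes are ignored. The small illustrative helper below, which is not part of the released code, makes this concrete:

```python
def scale_box(box, factor):
    """Shrink an (x1, y1, x2, y2) box around its center by `factor`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = (x2 - x1) * factor, (y2 - y1) * factor
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)

gt_box = (100, 100, 300, 200)
effective = scale_box(gt_box, 0.2)     # positive (effective) region
ignore_outer = scale_box(gt_box, 0.5)  # outer bound of the ignore region (paper)

# Paper (0.2-0.5): locations between `effective` and `ignore_outer` are ignored.
# This repo (0.2-0.2): the ignore band vanishes, so everything outside
# `effective` is treated as negative.
```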
## Main Results

### Results on R50/R101/X101-FPN
| Backbone | ignore range | ms-train | Lr schd | Train Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Config | Download |
|:--------:|:------------:|:--------:|:-------:|:--------------:|:-------------------:|:--------------:|:------:|:------:|:--------:|
| R-50  | 0.2-0.5 | N | 1x | 3.15 | 0.43 | 12.3 | 36.0 (35.9) | -      | model \| log |
| R-50  | 0.2-0.2 | N | 1x | 3.15 | 0.43 | 13.0 | 37.4        | config | model \| log |
| R-101 | 0.2-0.2 | N | 1x | 5.08 | 0.58 | 10.8 | 39.3 (37.9) | config | model \| log |
| X-101 | 0.2-0.2 | N | 1x | 9.38 | 1.23 | 5.6  | 42.4 (41.0) | config | model \| log |
Notes:
- 1x means the model is trained for 12 epochs (see the schedule sketch after this list).
- AP values in brackets are those reported in the original paper.
- All results are obtained with a single model and single-scale test.
- The X-101 backbone denotes ResNeXt-101-64x4d.
- All pretrained backbones use PyTorch-style weights.
- All models are trained on 8 Titan Xp GPUs and tested on a single GPU.
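For reference, the sketch below shows a typical mmdetection-style "1x" schedule (12 epochs of SGD with step decay). It is an assumption about the training recipe, given for illustration only; the released configs define the actual values.

```python
# Illustrative "1x" schedule (assumed, not copied from the released configs):
# 12 epochs of SGD with the learning rate decayed at epochs 8 and 11.
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[8, 11])
total_epochs = 12
```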
## Citations
The BibTeX reference is as follows.
@inproceedings{zhu2019feature,
  title={Feature Selective Anchor-Free Module for Single-Shot Object Detection},
  author={Zhu, Chenchen and He, Yihui and Savvides, Marios},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={840--849},
  year={2019}
}